Abstract
Multidrug-resistant bacterial infections are a rising threat to human health and currently account for 1.3 million deaths annually. Notably, 70% of these deaths are due to gram-negative pathogens, and no new classes of gram-negative-active antibiotics have been approved by the US Food and Drug Administration in the past 55 years. The challenges of converting compounds with in vitro biochemical activity to whole cell gram-negative antibacterial activity are significant, as the outer membrane and promiscuous efflux pumps thwart the potential of most antibiotic candidates. Significant strides have been made toward understanding compound penetration and accumulation in gram-negative bacteria, but efflux remains a major obstacle for antibiotic drug discovery. Recent advances in machine learning (ML) algorithms and increased accessibility of code and programs for the nonexpert suggest artificial intelligence could help address the efflux problem. Here, we discuss work toward understanding efflux and cast a vision for how ML can be utilized to address compound efflux from gram-negative bacteria.
Introduction
Antibiotic-resistant bacteria, deemed the “silent pandemic,”1 account for over 5 million infections annually, and are one of the top 10 threats to human health.2–4 In the United States alone, health care-associated costs for antibiotic resistance are estimated to be upwards of $4.6 billion annually.2,4,5 Recently, the World Health Organization updated the list of the top bacterial pathogens posing a significant threat to human health, and all “critical” priority pathogens are gram-negative, including drug-resistant Acinetobacter baumannii and Enterobacterales (encompassing Escherichia coli, Klebsiella pneumoniae, and Enterobacter cloacae).6,7 Resistance has been observed to every clinically used antibiotic.8
The dearth of novel antibiotics is tied to the challenge of converting compounds with in vitro enzymatic activity to compounds with whole-cell activity, due to the impermeability of the gram-negative outer membrane and promiscuous efflux pumps that expel most drugs.9–11 Advancement of novel chemical matter has also been challenging, and as such most antibiotics in later stages of the clinical pipeline are derivatives of known antibiotic classes (e.g., β-lactams), suggesting they will have similar resistance problems.12–15 Innovative strategies have been employed to identify new antibacterials,16–20 and these new approaches, together with historical screening efforts, have led to an arsenal of preclinical antibiotics that hit a diverse array of biological targets. Most of these compounds are effective only against gram-positive bacteria; importantly, however, they would be active against gram-negative bacteria if they could get into the cell and engage their target before being extruded by efflux. In this perspective we outline this “efflux problem,” highlight important advances toward understanding efflux in gram-negative bacteria, and speculate about how machine learning (ML) could be enlisted in this effort.
Efflux Pumps in Gram-Negative Bacteria
There are six families of efflux pumps in gram-negative bacteria, characterized by their protein sequence and energy source (ATP-driven or proton motive force driven).11,21 Most efflux pumps exist in the inner membrane, pumping compounds from the cytoplasm to periplasm, where compounds and xenobiotics are then recognized by promiscuous tripartite efflux pumps in the resistance nodulation division (RND) or ATP-binding cassette family and pumped out of the cell.11 Tripartite efflux pumps are the main efflux pumps attributed to antibiotic resistance due to their multiple binding pockets, polyspecificity, and fast kinetics to fully remove compounds from the cell22,23; however, recent data suggest inner membrane efflux pumps are not to be discounted, and also contribute to resistance (Fig. 1).24 Tripartite efflux pumps are composed of three components: an inner membrane component anchored in the inner membrane where compounds bind and enter the pump, a membrane fusion protein that traverses the periplasm and provides structural stability, and an outer membrane component, which serves as a channel for compounds to exit the cell.11 In E. coli, the main constitutively expressed efflux pump implicated in resistance is AcrAB-TolC, where AcrB is the inner membrane component that determines substrate specificity and TolC is the outer membrane component of all tripartite pumps.25 Pseudomonas aeruginosa and A. baumannii have more complex efflux landscapes, with significantly more tripartite efflux pumps that can be expressed under antibiotic stress and multiple outer membrane components for the tripartite complexes.11,26–28 Detailed summaries on the polyspecificity of efflux pumps and the mechanism of compound extrusion have been published, and interested readers can consult these reviews.11,21,25 In gram-negative pathogens, efflux pumps are held under tight genetic regulation and can be upregulated to enable survival under antibiotic stress, with multiple compensatory mechanisms to regulate efflux.11,29,30 Efflux pump upregulation has been noted clinically and is a key intrinsic antibiotic resistance mechanism,31–33 making it important to understand ways to evade efflux.

Fig. 1. Efflux pumps in gram-negative bacteria. Inner membrane efflux pumps are monomeric, embedded in the inner membrane, and pump compounds from cytoplasm to periplasm. Tripartite efflux pumps are trimeric (composed of three components) and pump compounds completely outside the cell. The inner membrane component of tripartite efflux pumps has multiple binding pockets and can extrude compounds from the outer leaflet of the inner membrane, periplasm, and cytoplasm. For a more detailed review on efflux pumps, see Du and coworkers11 and Alav and coworkers.25
Compound Accumulation in Gram-Negative Bacteria, Influx, and Efflux
Net accumulation into gram-negative bacteria requires a compound to penetrate the outer membrane faster than it is removed by efflux. Most antibiotics (including those engaging both cytoplasmic and periplasmic targets) have significant efflux liabilities, as demonstrated by a significant potentiation in antibacterial activity (measured by minimum inhibitory concentration [MIC]) upon knockout of proteins involved in efflux (Table 1). These data suggest that many antibiotics would be effective against gram-negative pathogens if they could evade efflux; interestingly, even some gram-negative-active antibiotics have a significant efflux liability (Table 1), suggesting these antibiotics could be made even more potent if efflux-evading versions could be identified.
Table 1. Antibacterial Activity of Compounds in Wild-Type and Efflux-Deficient Escherichia coli Suggests Most Antibiotics Have Efflux Liabilities
MIC, minimum inhibitory concentration.
The advent of modern LC-MS/MS technologies has enabled an unbiased look at compound accumulation in gram-negative bacteria; that is, virtually any compound can be analyzed by this method as it is agnostic of antibacterial activity and does not rely on inherent fluorescence of compounds.50–53 Assessment of the accumulation of large numbers of diverse compounds through this method has led to principles to guide the design of high-accumulating compounds: the eNTRy rules for E. coli9,48 and the PASsagE rules for P. aeruginosa.54,55 Inspection of the results from these studies, and of the antibiotics that have subsequently been developed with these principles, indicates that the chemical traits identified largely promote influx.43,47,48,54–61 Follow-up experimental studies are consistent with this thinking and suggest that eNTRy rule-compliant compounds have their accumulation facilitated by increased diffusion through porins48,62–66; as shown in Table 1, eNTRy rule-converted antibiotics, while active against wild-type gram-negative bacteria, still have efflux liabilities (although in some cases they are reduced). Identification of chemical traits that facilitate efflux evasion would enable antibiotic design, potentially leading to marked improvements for compounds that already possess antibacterial activity as well as engineering potency into those that do not. Another method to reduce efflux would be to co-dose antibiotics with an efflux pump inhibitor to slow down or stop extrusion of the compound. Although efflux pump inhibitors developed to date have not been clinically approved, this is a burgeoning field of research with high-impact potential, and interested readers should consult reviews on this topic67–70 and recent work toward structure-based design of inhibitors.71,72
Development of Efflux-Evading Antibiotics Through Derivative Synthesis
Tigecycline
Given the lack of fundamental understanding of chemical traits that enable efflux evasion, not surprisingly there are scant examples in the literature of successfully removing efflux liabilities from an antibiotic. If the main efflux pump responsible for extruding the molecule of interest can be identified, in principle, modifications to the antibiotic could be made to disrupt binding to the efflux pump, enhancing antibiotic activity. The signature example of this approach is the development of tigecycline from tetracycline over a 50-year stretch.73 Tetracycline, first discovered in the 1940s, is a broad-spectrum antibiotic that targets the 30S subunit of the ribosome, and resistance in E. coli arises upon mutations in the ribosome or overexpression of the Tet efflux pumps (Fig. 2, Table 2).73 Because the Tet efflux pumps were known to be a key contributor to resistance, extensive structure–activity relationship (SAR) studies were performed, and derivatives were screened against Tet-overexpressing strains of E. coli to identify those that could overcome these resistance mechanisms. While the mechanism of expulsion by the Tet efflux pumps and their binding site are unknown, SAR studies identified substitution at the 9-position of the tetracycline core as effective in mitigating efflux, and 9-(t-butylglycylamido)-minocycline (later named tigecycline) was identified (Fig. 2).73,74,76 In strains overexpressing Tet efflux pumps (including clinical isolates), tigecycline still shows single-digit MICs, with only a twofold increase in MIC due to efflux pump overexpression, whereas tetracycline loses all activity (MIC ≥32 µg/mL) (Table 2, top).74 Tigecycline demonstrates efflux liabilities similar to those of tetracycline through the tripartite efflux pumps AcrAB-TolC and AcrEF-TolC in E. coli; however, these efflux liabilities are minimal and only result in a one- to fourfold increase in MIC, in contrast to the Tet efflux pumps, which render tetracycline inactive (Table 2, bottom).75 Recently, resistance to tigecycline has been observed in clinical isolates of Enterobacteriaceae due to a plasmid-borne RND efflux pump, tmexCD-toprJ.77,78 This resistance mechanism is unique as RND efflux pumps are typically chromosomally encoded and not on mobile genetic elements; however, MICs of tigecycline only increase by four- to eightfold in tmexCD-toprJ expressing strains, similar to efflux through E. coli RND efflux pumps.77,78

Fig. 2. Tigecycline was identified through a medicinal chemistry campaign and demonstrates decreased efflux liabilities in strains overexpressing Tet efflux pumps and clinical isolates. Representative MICs in wild-type (WT) Escherichia coli ATCC 25922 and an E. coli Tet(A)-overexpressing strain; full MIC panel in Table 2.73,74 Tetracycline and tigecycline demonstrate similar efflux liabilities through RND efflux pumps measured by MIC fold change between WT E. coli (KAM3) and E. coli ΔAcrB (MIC values in Table 2).75 MIC, minimum inhibitory concentration; RND, resistance nodulation division.
Table 2. MIC Values of Tetracycline and Tigecycline in Lab Strains and Clinical Isolates of Escherichia coli
Top: Lab strains overexpressing genes encoding Tet efflux pumps74 used to identify the Tet efflux pumps as critical for extrusion of tetracycline from E. coli. Bottom: Lab strains engineered with plasmids expressing RND efflux pumps in KAM3 (AcrB knockout strain)75 demonstrate the minimal effect the Acr family of efflux pumps has on the activity of these compounds. For clinical isolates, a panel of 32 E. coli isolates was assessed and the MIC90, the MIC at which 90% of strains are inhibited, was determined.74
RND, resistance nodulation division.
In contrast to E. coli, where the main efflux liabilities lie in Tet efflux pump overexpression, tigecycline in P. aeruginosa is a substrate of the RND tripartite efflux pump MexXY-OprM.79 Upon deletion of MexXY-OprM, other RND efflux pumps (such as MexAB-OprM and MexCD-OprJ) are overexpressed to compensate, decreasing tigecycline susceptibility.79 Interestingly, the MIC shift for tigecycline in efflux pump-overexpressing strains is 16-fold, whereas tetracycline decreases in activity by up to 64-fold (Table 3).79 This observation suggests that tigecycline has reduced efflux liabilities compared with tetracycline across multiple species of gram-negative bacteria.
Table 3. MICs of Tetracycline and Tigecycline in Pseudomonas aeruginosa Demonstrate Both Have Efflux Liabilities Through RND Efflux Pumps in P. aeruginosa, Although Slightly Different Efflux Profiles.79
Nitrothiophene carboxamides
In a more recent example, the biopharmaceutical company Bugworks built a pipeline to use structure-based drug design to strategically engineer out efflux in compounds with minimal wild-type gram-negative activity, potent activity in efflux-deficient strains, and easily defined efflux profiles.80 They performed a high-throughput screen of a collection of compounds against wild-type E. coli and strains lacking AcrB or TolC, to identify compounds with efflux liabilities through AcrB alone (demonstrated by similar MIC values in the E. coli ΔAcrB and ΔTolC strains). Compounds that were solely AcrB substrates were chosen to alleviate concerns about upregulation of compensatory efflux pumps that could negate any perturbations to efflux in this study.80 Compounds that demonstrated activity in efflux-deficient E. coli and narrow efflux liabilities were advanced to in silico structure-based design using docking studies with AcrB.80 A nitrothiophene carboxamide compound (Fig. 3A, compound 7) was identified through the initial MIC screen,80 and docking studies were performed, identifying key interactions between the western benzene ring and the phenylalanine binding pocket of AcrB (Fig. 3B). MIC activity of the compound in the presence of PAβN, an efflux pump inhibitor with a known binding site in AcrB,81 suggested the compound binds in the phenylalanine-lined distal binding pocket, a common binding site for lipophilic antibiotics.11,82,83 Thermal shift assays and Nile Red efflux assays further validated the hit as an efflux substrate through AcrB as predicted by their primary screen and in silico studies. Strategic modifications on the scaffold to disrupt binding interactions in the distal binding pocket (via fluorination) abrogated efflux liabilities (Fig. 3A, compound 12) with the wild-type MIC elevated only two- to fourfold relative to MICs in the absence of efflux.80 This is an outstanding example of using structure-based drug design and in silico docking to predict binding to efflux pumps, leading to an efflux-evading derivative. These case studies demonstrate the potential of strategically modifying small molecules to evade efflux, work that would be greatly facilitated by a holistic understanding of physicochemical properties correlated with efflux evasion.
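The triage logic of such a screen (an efflux liability that disappears to the same extent in ΔAcrB and ΔTolC strains points to AcrB alone) can be sketched in a few lines of Python. This is a minimal illustration, not the published pipeline: the function name, the fourfold cutoff for calling an efflux liability, and the twofold window for calling two MICs similar are all assumptions chosen for the example.

```python
def classify_efflux_profile(mic_wt, mic_dacrb, mic_dtolc,
                            min_shift=4.0, similar_within=2.0):
    """Label a compound from MICs in wild-type, ΔAcrB, and ΔTolC E. coli.

    min_shift: fold drop in MIC on losing TolC that counts as an efflux
        liability (hypothetical cutoff).
    similar_within: fold difference between the ΔAcrB and ΔTolC MICs that
        is still treated as "similar" (hypothetical cutoff).
    """
    if min(mic_wt, mic_dacrb, mic_dtolc) <= 0:
        raise ValueError("MIC values must be positive")
    if mic_wt / mic_dtolc < min_shift:
        return "no clear efflux liability"
    if mic_dacrb / mic_dtolc <= similar_within:
        # Removing AcrB alone recovers the full ΔTolC potency.
        return "AcrB-only substrate"
    # ΔTolC is markedly more susceptible than ΔAcrB, so other
    # TolC-dependent pumps must also be contributing.
    return "substrate of additional TolC-dependent pumps"


# Made-up MICs (µg/mL) for illustration only
print(classify_efflux_profile(64, 1, 1))
print(classify_efflux_profile(64, 16, 1))
```

Compounds in the first class are the ones a campaign of this kind would carry forward, since disrupting binding to a single pump is a more tractable design goal.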

Fig. 3. Screening of a compound collection identified a nitrothiophene carboxamide scaffold with efflux liabilities that were engineered out via structure-based drug design.80
Attempts to Understand Efflux Trends Using Antibacterial Activity Datasets
Beyond these targeted SAR campaigns, cheminformatic tools have been applied to try to identify general properties that correlate with efflux avoidance. In an attempt to understand structural features correlated with gram-negative antibacterial activity, AstraZeneca performed an analysis of >3,000 active compounds from antibacterial high-throughput screens and noted that efflux-evading compounds tended to be small polar molecules and large zwitterionic compounds.84 This study was followed up by a deeper retrospective analysis of AstraZeneca’s screening collection to determine traits for whole-cell gram-negative activity and efflux evasion, which identified compounds with lower cLogD7.4 values as less susceptible to efflux, although the authors noted that lipophilicity alone was not predictive of efflux.85 Both studies necessarily relied on antibacterials and investigated a limited number of physicochemical properties, and the datasets contained disproportionate numbers of β-lactam and fluoroquinolone antibiotics, making it difficult to identify generalizable trends actionable for the development of efflux-evading antibiotics.
To utilize more complex cheminformatics to identify trends in efflux avoidance, the Zgurskaya lab analyzed the antibacterial activity of a collection of fluoroquinolones and β-lactams in hyperporinated strains of P. aeruginosa and E. coli, where the permeation barrier has been removed through induction of a modified siderophore transporter protein to form pores in the outer membrane. 36 Using the collection of antibiotics and MIC fold changes between wild-type and permeabilized or efflux-deficient strains, 142 physicochemical properties were calculated and a random forest classification model identified traits that correlated with uptake and efflux for E. coli and P. aeruginosa. 42 These descriptors were predominantly charge-based and the authors noted that the main driver of efflux in E. coli was lipophilicity (where lower SLogP values correlated with decreased efflux). 42 In P. aeruginosa, increased positive charge led to increased levels of permeation; however, increases in relative positive charge also increased efflux liabilities. 42 This work has been expanded into studying uptake and efflux of other antibiotic classes (such as the oxazolidinones 59 ), understanding efflux pump inhibitors and avoiders in P. aeruginosa72,86,87 and A. baumannii,59,88 as well as continued work toward understanding efflux trends in E. coli. 72 A crucial output of these studies was the development and subsequent sharing of the hyperporinated strains (discussed further below), allowing multiple research groups to use them in their experiments.
To expand beyond FDA-approved antibacterial compounds and their derivatives, Eric Brown’s lab screened 314,000 compounds for antibacterial activity against wild-type E. coli and E. coli ΔTolC, and utilized principal component analysis and a random forest classification model to identify physicochemical properties that correlate with antibacterial activity and efflux.89 This work also identified lipophilicity parameters as important for evading efflux, where compounds with lower cLogP or cLogD7.4 values had decreased efflux liabilities. Most recently, the Zachariae lab analyzed MIC data in E. coli of 74,000 compounds from the CO-ADD database and performed matched molecular pair analysis for ∼1,000 gram-negative active compounds to identify small structural changes that define efflux substrates and avoiders, in addition to structural changes that render compounds inactive.90 The addition of positively charged moieties often converted compounds to nonsubstrates, and certain functional groups such as ketones, aldehydes, and aromatic alcohols often aided in efflux pump recognition, suggesting lipophilicity as a key contributor to efflux.90
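The aggregation step of a matched molecular pair analysis can be illustrated with a short Python sketch. It assumes the pairs have already been identified (in practice this requires fragmenting structures with a cheminformatics toolkit) and that each compound carries an efflux ratio, defined here as MIC in wild type divided by MIC in the efflux-deficient strain. The transformation labels and numbers are made up for illustration and are not data from the cited study.

```python
import math
from collections import defaultdict


def summarize_transformations(pairs):
    """Summarize how each structural change shifts the efflux ratio.

    pairs: iterable of (transformation, ratio_before, ratio_after), where
        ratio = MIC(wild type) / MIC(efflux-deficient strain).
    Returns {transformation: (n_pairs, mean log2 change in efflux ratio)}.
    A negative mean indicates the change tends to reduce efflux liability.
    """
    deltas = defaultdict(list)
    for transformation, before, after in pairs:
        deltas[transformation].append(math.log2(after / before))
    return {t: (len(v), sum(v) / len(v)) for t, v in deltas.items()}


# Hypothetical matched pairs (labels and ratios are illustrative only)
pairs = [
    ("H>>CH2NH2", 64, 4),
    ("H>>CH2NH2", 32, 8),
    ("H>>CHO", 2, 16),
]
for change, (n, mean_shift) in summarize_transformations(pairs).items():
    print(f"{change}: n={n}, mean log2 shift={mean_shift:+.1f}")
```

Working in log2 units keeps the summary aligned with the twofold dilution steps of a standard MIC assay.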
Advances in ML and cheminformatic analysis, as well as a push toward increasing accessibility of written code and programs for the nonexpert user, have allowed for the development of more comprehensive physicochemical descriptors and a better understanding of the properties correlating with compound efflux. While no fully developed rules for designing out efflux have been codified, these studies provide a framework for analyzing large datasets to extract physicochemical properties. Currently, these large datasets are antibacterial activity-based, requiring compounds to display measurable cell death in MIC assays, limiting the diversity of compounds that can be studied and making it challenging to parse out properties that correlate with antibacterial activity (engagement with a bacterial target) from properties that correlate with efflux liabilities. Additionally, some of the physicochemical properties identified as correlated with efflux are esoteric, 91 making them challenging to apply in medicinal chemistry campaigns.
Conquering Efflux: A Final Hurdle for Antibiotic Design
Tools and assays to measure efflux
Efflux studies to date have relied on a small set of tool strains and assays to assess efflux and infer liabilities. Historically, efflux pump knockout strains, particularly E. coli ΔAcrB or ΔTolC and P. aeruginosa ΔMexAB or ΔMexXY, have been used to assess compound activity to compare with wild-type or clinical isolate activity to understand the role of efflux. 92 To expand the toolkit of genetic knockout strains, the Zgurskaya lab developed sets of hyperporinated strains for multiple pathogens (including E. coli, A. baumannii, and P. aeruginosa) to parse out the role of permeation and efflux on compound activity.23,36,93,94 These strains contain an inducible modified siderophore protein (FhuAΔC/Δ4L) with an internal diameter of 2.4 nm that is engineered into wild-type and efflux-deficient strains to create a pore for compounds to easily cross the outer membrane, generating a panel of four strains that can be used to systematically evaluate the contributions of efflux and permeation to compound accumulation (Fig. 4). These strains have been widely used59,72,95–99 and the technology can be applied to other gram-negative species. Recently, a new system was developed by the Cox lab where all known efflux pumps in E. coli were knocked out (EKO-35) and the pore protein could be induced. 24 Additionally, plasmids expressing single efflux pumps were introduced to the EKO-35 and EKO-35-Pore background to understand the role of single efflux pumps on activity and bacterial physiology, creating a rich strain collection to explore hypotheses around efflux liabilities of small molecules, substrate scope and genetic regulation of efflux pumps, and the role of efflux pumps on cellular homeostasis. 24

Fig. 4. Hyperporinated strains developed by the Zgurskaya lab allow for systematic evaluation of the contributions of permeation and efflux.36 Induction of a modified siderophore transporter porinates the outer membrane and facilitates rapid compound uptake; deletion of efflux pumps or efflux pump components removes the efflux barrier. This technology has been applied to Escherichia coli (through deletion of TolC),36 Pseudomonas aeruginosa (through deletion of six tripartite efflux pumps),133 and Acinetobacter baumannii (through deletion of three key RND efflux pumps).88,133
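To make the four-strain comparison concrete, the following Python sketch turns MICs from the panel into fold changes attributable to each barrier. It is a simplified illustration: the strain keys and example MICs are hypothetical, and a simple ratio of MICs is only one of several ways such data could be summarized.

```python
def barrier_contributions(mic):
    """Fold changes in MIC attributable to each barrier.

    mic: dict of MICs (same units) keyed by strain:
        'WT'           wild type
        'WT-Pore'      wild type with the pore induced
        'dEfflux'      efflux-deficient
        'dEfflux-Pore' efflux-deficient with the pore induced
    """
    wt = mic["WT"]
    return {
        "efflux": wt / mic["dEfflux"],            # efflux removed only
        "outer_membrane": wt / mic["WT-Pore"],    # permeation barrier removed only
        "both": wt / mic["dEfflux-Pore"],         # both barriers removed
    }


# Made-up MICs (µg/mL) for illustration only
example = {"WT": 64, "WT-Pore": 16, "dEfflux": 4, "dEfflux-Pore": 0.5}
for barrier, fold in barrier_contributions(example).items():
    print(f"{barrier}: {fold:g}-fold")
```

In this made-up example, removing both barriers gives a larger shift than the product of the two single-barrier shifts, the kind of pattern that would suggest the two barriers act together rather than independently.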
In addition to the expansion of strain-based tools to explore efflux, new assays and creative approaches are needed to assess efflux liabilities beyond MIC and antibacterial activity assays. As highlighted above, MIC assays only account for changes in antibacterial activity and cannot distinguish between perturbations in uptake, efflux, or target engagement. Additionally, MIC assays have poor sensitivity, limiting the ability to detect small changes in accumulation and activity. If efflux could be measured in an activity-agnostic manner, large datasets could be collected on nonantibiotic compounds, allowing a much greater range of scaffolds and physicochemical properties to be explored for understanding efflux. Outside of MIC measurements, current technology for assessing efflux in a medium-throughput manner involves fluorescence assays using known efflux substrates such as ethidium bromide, Hoechst dyes, or Nile Red.92,100–103 While these assays allow nonantibiotic compounds to be assessed, they suffer from limited sensitivity, can misclassify compounds as poor substrates, or can suffer from interference from background fluorescence from the screening collection. A direct and sensitive method to understand and quantify efflux liabilities would enable a fuller understanding of small functional group changes on efflux liabilities and a more comprehensive view of efflux.
Advances in LC-MS/MS-based accumulation assays,104 biochemical and biophysical assays with functional efflux pumps (such as proteoliposome105 or nanodisc106 assays), bio-orthogonal tagging assays,50,107,108 and imaging50,52,109,110 all provide avenues to measure efflux (through use of appropriate strains), and many of these approaches could be adapted for medium- to high-throughput screens. Creative applications of these technologies could enable the generation of large datasets of efflux liabilities for a diverse collection of compounds beyond the previously tested antibacterials, a strategy that would parallel the recent advances in understanding compound accumulation trends in gram-negative bacteria by quantifying compound accumulation with an LC-MS/MS-based assay.48,54 ML and cheminformatic algorithms are limited in utility and accuracy mainly by the training sets and the data available. Development of accumulation and efflux assays to assess large numbers of compounds would facilitate ML and could lead to discovery of chemical traits that enable efflux evasion.
Species beyond E. coli
Most studies toward understanding efflux have focused on E. coli, due to the large collections of tool strains developed and the ease of using these strains. Additionally, efflux pumps are highly conserved within Enterobacterales, so observations in E. coli are likely generalizable to other species, including K. pneumoniae, Salmonella, and E. cloacae.111–114 However, efflux in other gram-negative pathogens is still a significant concern, particularly for critical priority pathogens P. aeruginosa and A. baumannii, both of which demonstrate significantly lower levels of permeation compared with E. coli.54,93 Mycobacterium tuberculosis is also problematic and plagued by similar efflux and influx challenges, and strategies discussed herein would apply to this pathogen. Additionally, the porin and efflux pump landscape across pathogenic gram-negative species is highly diverse, with low sequence homologies between efflux pumps, and compensatory genetic regulatory mechanisms to upregulate and express alternative efflux pumps during antibiotic stress. 11 ML algorithms could be used to compare and contrast the permeation and efflux landscapes across species and develop a comprehensive view of permeation and efflux. Utilizing accumulation, efflux, and activity data in these strains, supervised learning and clustering algorithms17,115,116 could define the necessary features for optimal accumulation in these species and ultimately lead to guidelines for medicinal chemistry campaigns to follow when designing antibiotics.
In E. coli, many of the tripartite efflux pumps responsible for antibiotic resistance have been well-studied with cryo-EM structures and robust biochemical assays.11,117,118 In P. aeruginosa and A. baumannii, these efflux pumps are not as widely studied, but crystal structures and cryo-EM structures for the main efflux pumps are available.119–124 For efflux pumps in other species, or with lower expression profiles in critical priority pathogens, computational programs such as AlphaFold can predict the structure of the proteins based on amino acid sequence, providing a general idea of the structure of the efflux pump and its homology to known efflux pumps. Utilizing structural data and known information about binding sites and the mechanism of extrusion in efflux pumps, large compound collections could be screened for in silico binding to efflux pumps to prioritize compounds with low binding and hypothesized to be poor substrates. A similar approach could be used to identify efflux pump inhibitors. To expedite analysis and increase the size of the in silico library that can be screened, Deep Docking can be utilized, which takes a small subset of the virtual library and docks using conventional methods, then trains a feedforward neural network to predict docking scores of the entire virtual collection.125,126 Although these approaches are still in their infancy and will not be fruitful in all cases, successes are emerging, such as the screening of compounds from the ZINC15 database for compounds active against a protease in severe acute respiratory syndrome coronavirus 2.127,128
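The dock-a-subset, predict-the-rest workflow can be sketched with a toy Python example. This is not the Deep Docking implementation: the feedforward neural network is replaced here by a simple k-nearest-neighbour average so the example needs only the standard library, the `dock` function is a made-up stand-in for a real docking run, each "compound" is just two invented descriptors, and only a single round is shown. Because the goal in the efflux setting is to find poor pump binders, the shortlist keeps the weakest predicted binders.

```python
import random


def dock(features):
    """Stand-in for an expensive docking run (toy scoring function).

    More negative scores mean stronger predicted binding to the pump.
    """
    x, y = features
    return -(x + 2.0 * y)


def knn_predict(docked, query, k=3):
    """Surrogate model: mean score of the k nearest already-docked compounds."""
    def squared_distance(item):
        (x, y), _score = item
        return (x - query[0]) ** 2 + (y - query[1]) ** 2
    nearest = sorted(docked, key=squared_distance)[:k]
    return sum(score for _, score in nearest) / len(nearest)


def shortlist_weak_binders(library, n_docked=50, n_keep=20, seed=0):
    """Dock a random subset, predict the rest, keep the weakest predicted binders."""
    rng = random.Random(seed)
    subset = rng.sample(library, n_docked)
    docked = [(features, dock(features)) for features in subset]
    predicted = [(knn_predict(docked, features), features) for features in library]
    predicted.sort(reverse=True)  # least negative (weakest binding) first
    return [features for _score, features in predicted[:n_keep]]


# Toy "virtual library": 100 compounds described by two made-up descriptors
library = [(i / 10, j / 10) for i in range(10) for j in range(10)]
hits = shortlist_weak_binders(library)
print(len(hits), "compounds shortlisted after docking only 50 of", len(library))
```

The saving comes from calling the expensive `dock` step on only a fraction of the library; for an inhibitor search, the same loop would instead keep the strongest predicted binders.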
Understanding basic efflux pump biology
Efflux pumps are highly promiscuous and have evolved to expel a wide range of substrates, and many efflux pumps have overlapping substrate scopes, allowing for compensatory mechanisms under antibiotic stress. While much is known about the substrate specificities of AcrAB-TolC in Enterobacterales and the pseudomonal analog MexAB-OprM, there are 34 other efflux pumps in E. coli, 24 and many in P. aeruginosa that are not clearly understood. A. baumannii has ∼25 characterized efflux pumps, 12 of which are tripartite efflux pumps, but over 100 genes exist encoding efflux transporters, demonstrating the dearth of understanding of efflux in this species. 27 As detailed above, tool strains have been developed where all 35 efflux pumps have been knocked out of E. coli and then individually reexpressed to understand the contribution of each efflux pump to bacterial health and physiology and the substrate specificity of each pump. 24 While an extensive study like this has not been performed in other gram-negative pathogens, there are multiple efflux knockout strains of the main efflux pumps of P. aeruginosa, K. pneumoniae, and A. baumannii that can be used as tool strains.129–133 Additionally, titratable expression platforms have been developed for E. coli to enable a deeper understanding of efflux. 134 Using these tool strains and assays to assess efflux (MIC, ethidium bromide efflux, intracellular accumulation, subcellular accumulation, imaging, and bio-orthogonal labeling), datasets regarding the substrate scope of efflux pumps within each species could be generated. Because of the size of these datasets and the complexity of efflux, AI and ML algorithms are poised to parse out trends in substrate specificities for individual efflux pumps and classes of efflux pumps, both within species and across species. 
This information would illuminate key features of basic efflux biology and empower better understanding of the role of efflux pumps, particularly the lesser studied inner membrane efflux pumps, toward compound accumulation and efflux.
Accessible and applicable platforms for drug property prediction
The ability to calculate properties from a chemical structure and predict biological activity or liabilities to prioritize compound synthesis is called QSAR (quantitative SAR) or QSPR (quantitative structure–property relationship) and is crucial to medicinal chemistry campaigns, which are costly and have high attrition rates.8,135,136 Since the development of Lipinski’s rules in 1997, 137 there have been multiple algorithms developed and optimized to accurately predict cLogP, a key measure of lipophilicity of a compound and an important parameter to optimize for oral bioavailability.138–141 These algorithms rely on computational methods to estimate the contributions of the molecular fragments toward lipophilicity; advances in chemoinformatics and chemical descriptors can improve lipophilicity calculations, particularly for boutique functional groups and novel chemical scaffolds.
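Fragment-based lipophilicity estimators of this kind generally take an additive form. As a generic sketch (the symbols are illustrative and are not tied to any specific program cited here), the calculated value is a sum of fragment contributions plus correction terms for interactions between fragments:

```latex
\log P_{\mathrm{calc}} \approx \sum_{i} n_i f_i + \sum_{j} m_j c_j
```

Here $n_i$ is the number of occurrences of fragment $i$, $f_i$ is its fitted lipophilicity contribution, and $c_j$ is a correction factor applied $m_j$ times. Unusual functional groups are difficult for such methods because no well-fitted $f_i$ exists for them.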
Understanding and predicting pharmacokinetic properties of compounds (Absorption, Distribution, Metabolism, Excretion, and Toxicity [ADMET]) is another growing area of QSAR and QSPR, and work toward developing refined ADMET predictors relies heavily on ML and artificial intelligence.142–145 Toxicity predictions lag behind other QSAR/QSPR approaches, and current algorithms do not always identify liabilities before a lead compound is advanced.146 ML methods, such as deep neural networks, can be used to identify trends in toxicity and liable functional groups,147,148 and have been used to predict hERG toxicity.149
In gram-negative bacteria, compound development is further complicated by permeation and efflux challenges, in addition to target engagement and ADMET challenges. The development of platforms to predict accumulation and efflux could expedite antibiotic drug discovery, and the marriage of entry and efflux rules with filters for toxicity or drug-likeness could enable more rapid development of new antibiotics. eNTRyway (https://entryway.igb.illinois.edu/) is a free and easy-to-use web-based application to predict the accumulation of a compound in E. coli based on the eNTRy rules cutoffs43,104; this approach could be expanded to other species, or to integrate efflux predictions, or to flag metabolic liabilities/toxic functional groups.
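A rule-based filter of this type can be sketched in Python as follows. This is not the eNTRyway implementation: it assumes the descriptors have already been computed elsewhere, and the cutoffs (a primary amine present, five or fewer rotatable bonds, globularity of 0.25 or less) are the commonly cited eNTRy rule values from the literature rather than values stated in this text, so they should be treated as assumptions and checked against the original reports.

```python
def passes_entry_rules(has_primary_amine, rotatable_bonds, globularity,
                       max_rotatable_bonds=5, max_globularity=0.25):
    """Check precomputed descriptors against eNTRy-style cutoffs.

    Cutoff defaults are assumptions taken from the commonly cited rules.
    Returns (overall_pass, per-rule results) so failures can be inspected.
    """
    checks = {
        "ionizable nitrogen": bool(has_primary_amine),
        "rigidity": rotatable_bonds <= max_rotatable_bonds,
        "low three-dimensionality": globularity <= max_globularity,
    }
    return all(checks.values()), checks


# Hypothetical descriptor values for illustration only
ok, detail = passes_entry_rules(has_primary_amine=True,
                                rotatable_bonds=7, globularity=0.10)
print(ok, [rule for rule, passed in detail.items() if not passed])
```

Returning the per-rule results alongside the verdict makes it easier to see which property a derivative synthesis would need to change, and an efflux or toxicity flag could be added to the same dictionary.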
While there is a growing collection of drug design principles, many of these algorithms rely on proprietary physicochemical descriptors or require a trained professional to utilize the algorithm. Development of open-access or democratized platforms would allow these principles to be applied widely, even by nonexperts. As QSAR and QSPR approaches expand, multivariable platforms that consider permeation, efflux, toxicity, ADMET properties, and drug-likeness, and weight these parameters to prioritize compounds, could expedite antibiotic drug discovery. Such a platform could be made broadly accessible through a web-based or downloadable application requiring only chemical structures as input. Finally, as compounds from these platforms are made and tested, these data can be cycled into the ML algorithm to improve the predictive accuracy of the model and adjust the way that each variable is weighted.
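One simple way such a multivariable platform could combine its predictions is a weighted score, sketched below in Python. Everything here is hypothetical: the property names, the weights, and the compound scores are invented for illustration, and each per-property prediction is assumed to have been scaled to run from 0 (poor) to 1 (good) by upstream models.

```python
def priority_score(predictions, weights):
    """Weighted mean of per-property scores, each scaled 0 (poor) to 1 (good)."""
    missing = set(weights) - set(predictions)
    if missing:
        raise KeyError(f"missing predictions: {sorted(missing)}")
    total = sum(weights.values())
    return sum(weights[k] * predictions[k] for k in weights) / total


def rank_compounds(candidates, weights):
    """Return compound names ordered from highest to lowest priority."""
    return sorted(candidates,
                  key=lambda name: priority_score(candidates[name], weights),
                  reverse=True)


# Hypothetical weights and predictions for illustration only
weights = {"permeation": 0.3, "efflux_evasion": 0.3, "low_toxicity": 0.2,
           "admet": 0.1, "drug_likeness": 0.1}
candidates = {
    "cmpd_A": {"permeation": 0.9, "efflux_evasion": 0.2, "low_toxicity": 0.8,
               "admet": 0.5, "drug_likeness": 0.5},
    "cmpd_B": {"permeation": 0.6, "efflux_evasion": 0.8, "low_toxicity": 0.7,
               "admet": 0.5, "drug_likeness": 0.5},
}
print(rank_compounds(candidates, weights))
```

In a deployed system the weights would not be fixed by hand as they are here; they are the quantities that could be re-fitted as experimental results on synthesized compounds are fed back into the model.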
Conclusions
Significant advances have been made toward understanding compound accumulation in gram-negative pathogens, and in using this information to design new antibiotics. However, a full understanding of compound efflux remains elusive, and even many antibiotics that are clinically approved for treatment of gram-negative infections have efflux liabilities. Existing ML algorithms are likely sufficient to solve the problem, but this has not happened due to the limited data on the types of compounds that are efflux substrates. Once these data are available, ML algorithms and artificial intelligence are poised to identify trends and correlations not evident to the human user. With a growing list of preclinical antibiotic candidates ripe for development, compound efflux remains one of the final frontiers to cross to expedite antibiotic discovery and development, and ML will likely be an important tool to conquer this challenge.
Footnotes
Acknowledgment
Figures 1 and
were made using Biorender.com.
Authors’ Contributions
R.J.U. and P.J.H. conceptualized the article, and R.J.U. wrote the article with assistance from P.J.H.
Disclosure Statement
R.J.U. declares no competing financial interests. The University of Illinois has filed patents on some compounds described in this work, on which P.J.H. is an inventor.
Funding Information
P.J.H. thanks the NIH-NIAID for support of work in this area (R01AI176523). R.J.U. was supported by an NIH Ruth Kirschstein Award (F31AI161953) and was an NSF predoctoral fellow.
