Abstract
Objective
This study aimed to systematically identify clinic-based predictors of poor proton pump inhibitor (PPI) response in patients with suspected laryngopharyngeal reflux disease (LPRD) (based on RSI >13 and RFS >7) and to develop a predictive model for individualized management.
Methods
This retrospective cohort study analyzed adult patients with suspected LPRD who were treated with standard-dose PPIs for ≥3 months between 2021 and 2024. Patients were stratified into PPI-responsive and PPI-nonresponsive groups. Clinical characteristics, pretreatment Reflux Symptom Index (RSI) score, adherence to lifestyle modifications, findings from electronic laryngoscopy (specifically the presence of chronic hyperplastic/keratotic changes), and objective voice assessment via acoustic analysis (dichotomized as normal/abnormal) were evaluated. Independent predictors were identified via univariate and multivariate logistic regression and used to construct a predictive model. Model performance was assessed using ROC curves, calibration plots, and decision curve analysis in training and internal validation cohorts.
Results
A total of 301 patients were included. Multivariate analysis identified four independent predictors of poor PPI response: higher pretreatment RSI score (OR=1.180), low adherence to lifestyle modifications (<90%, OR=4.660), the presence of laryngoscopic hyperplastic/keratotic changes (OR=5.440), and abnormal pretreatment voice acoustic parameters (OR=2.755). The predictive model incorporating these variables demonstrated excellent discriminative ability, with an area under the ROC curve (AUC) of 0.856 (95% CI: 0.805-0.907) in the training set and 0.891 (95% CI: 0.826-0.955) in the validation set. The model also showed good calibration and provided positive net benefit across a range of clinical decision thresholds.
Conclusion
The developed predictive model, based on readily accessible clinical, laryngoscopic, and vocal functional variables, effectively identifies suspected LPRD patients at high risk of poor PPI response. These findings are hypothesis-generating and lay the groundwork for future prospective studies to validate the model’s utility and to determine whether it can facilitate personalized management strategies in clinical practice.
Keywords
1. Introduction
Laryngopharyngeal reflux disease (LPRD) was first reported in the literature in 1968, 1 and it was not until 2002 that the American Academy of Otolaryngology-Head and Neck Surgery formally introduced and applied the term to describe the pathological retrograde movement of gastric contents into the larynx and hypopharynx. 2 Currently, LPRD is defined as the reflux of gastroduodenal contents above the upper esophageal sphincter into the upper aerodigestive tract, including the nasopharynx, oropharynx, laryngopharynx, and larynx, which can cause morphological changes and a series of symptoms and signs. Although research on LPRD has spanned decades, its etiology remains unclear. Some scholars attribute it to transient lower esophageal sphincter relaxation, 3 while others suggest it may be due to acidic secretions from heterotopic gastric mucosa in the cervical esophagus. 4 The most common symptoms of LPRD include throat discomfort, foreign body sensation, chronic throat clearing, voice disorders, hoarseness, and chronic cough. 5 As early as 1990, Koufman et al 6 reported that approximately 10% of patients visiting otolaryngology clinics tested positive for LPRD. Using reflux symptom scales as screening tools, more recent studies have reported LPRD positivity rates ranging from 10.15% to 18.8% in various populations,7-9 indicating its status as a prevalent chronic condition.
Current management strategies for LPRD include proton pump inhibitors (PPIs), lifestyle modifications, and dietary adjustments. Standard-dose PPIs are the first-line pharmacological treatment, reducing gastric acid secretion by inhibiting the H+/K+-ATPase in gastric parietal cells, thereby decreasing the acidity of refluxate and alleviating symptoms such as cough and throat irritation. 10 However, their efficacy remains debated. Some studies demonstrate significant improvement in symptoms and signs with PPI therapy,11-14 while others show no clear advantage over placebo, highlighting a significant placebo effect.15,16 Moreover, clinical practice reveals that even with guideline-adherent treatment, 30%-40% of LPRD patients exhibit poor response to PPIs, manifesting as no improvement, partial relief, or worsening of symptoms. These patients often require adjusted treatment strategies, increasing healthcare burdens and potentially reducing treatment adherence.
The issue of poor PPI response has thus become a critical bottleneck in LPRD management, yet the factors that reliably predict it remain poorly defined. While some studies have begun to explore potential predictors—such as baseline symptom severity, the presence of pepsin in saliva, or specific laryngoscopic findings—the evidence is fragmented and often inconsistent.17-20 Many of these studies are limited by small sample sizes, a narrow focus on single variables, or a failure to integrate multidimensional patient data. Consequently, no consensus exists on a robust, clinically applicable set of predictors that can be used at the point of care to identify patients at high risk of treatment failure. This lack of a predictive framework means that most patients are subjected to a lengthy and often ineffective empirical trial of PPI therapy, delaying the implementation of more effective, individualized management strategies.
To address this clinical gap, we conducted a retrospective cohort study of suspected adult LPRD patients treated with standard-dose PPIs for at least 3 months. Our primary objective was to systematically evaluate a comprehensive set of readily accessible clinical, behavioral, laryngoscopic, and vocal function variables to identify independent core predictors of poor PPI response. We then aimed to integrate these predictors into a reliable, multi-dimensional predictive model. Such a tool would assist clinicians in the early identification of high-risk patients, facilitate individualized treatment planning, optimize clinical workflows, and ultimately improve therapeutic outcomes.
2. Methods
2.1. Study Population
This single-center retrospective cohort study was approved by the institutional ethics review board (Approval No. AF/SC-08/03.2). The study population comprised suspected adult LPRD patients who visited the voice and laryngology outpatient clinic and received standard treatment and follow-up between January 2021 and December 2024.
Inclusion criteria were: (1) age 19-75 years; (2) typical LPRD symptoms, including frequent throat clearing, hoarseness, globus sensation, and irritative cough; (3) PPI treatment for at least 3 months; (4) clinical suspicion of LPRD based on commonly used screening criteria combining symptoms and laryngoscopic findings, namely a pretreatment Reflux Symptom Index (RSI) score >13 and a Reflux Finding Score (RFS) >7, in the absence of objective reflux testing (e.g., MII-pH monitoring); this dual-criteria approach is consistent with the diagnostic guidelines established by Belafsky et al21,22 and is widely adopted in clinical practice to improve diagnostic accuracy; (5) availability of complete baseline clinical data, recordings and reports from pretreatment electronic laryngoscopy, and pretreatment voice acoustic analysis data.
Exclusion criteria were: (1) previous upper gastrointestinal or laryngopharyngeal surgery; (2) history of head and neck, esophageal, or gastric malignancy or radiotherapy; (3) contraindications to PPIs or concurrent use of other acid-affecting medications during the study; (4) severe cardiac, hepatic, renal dysfunction, or malignancy; (5) incomplete clinical data precluding evaluation.
The patient selection flowchart is shown in Figure 1.
Figure 1. Patient screening process.
2.2. Treatment Protocol and Grouping
All enrolled patients received standard LPRD management, consisting of: (1) pharmacotherapy: an oral standard-dose PPI twice daily, taken 30-60 minutes before meals to maximize acid suppression, for at least 3 months; and (2) lifestyle intervention: standardized dietary and behavioral advice, including avoidance of high-fat, spicy, and acidic foods, coffee, chocolate, and alcohol; no food intake within 3 hours before bedtime; smoking cessation; head-of-bed elevation; and weight control.
Based on treatment response after 3 months, patients were divided into two groups: a PPI-responsive group, defined by a post-treatment RSI reduction of ≥50% from baseline (a reduction of exactly 50% counted as responsive), and a PPI-nonresponsive group, defined by a post-treatment RSI reduction of <50% from baseline.
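The grouping rule above can be written as a small function. This is an illustrative Python sketch rather than the authors' code; the function name and the integer comparison (chosen so that a reduction of exactly 50% is never lost to floating-point rounding) are assumptions made for this example.

```python
def classify_ppi_response(baseline_rsi: int, post_rsi: int) -> str:
    """Apply the >=50% RSI-reduction rule; exactly 50% counts as responsive.

    Hypothetical helper: RSI totals are integers (0-45), so the test
    (baseline - post) / baseline >= 0.5 is rewritten without division.
    """
    if baseline_rsi <= 0:
        raise ValueError("baseline RSI must be positive")
    reduction_doubled = 2 * (baseline_rsi - post_rsi)
    return "responsive" if reduction_doubled >= baseline_rsi else "nonresponsive"
```

For instance, a patient whose RSI falls from 20 to 10 is classified as responsive, while a fall from 20 to 11, or any worsening, is classified as nonresponsive.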
2.3. Outcome Measures and Data Collection
The following data were collected via electronic medical records and dedicated case report forms: (1) Baseline clinical data: age, sex, body mass index (BMI), smoking history (defined as ≥1 cigarette/day for ≥1 year), alcohol consumption (≥1 alcoholic drink/week), and caffeine intake (≥1 cup of coffee or strong tea/day). (2) Symptom assessment: The RSI was used to assess symptom burden pre- and post-treatment. The RFS was used as part of the diagnostic inclusion criteria to confirm the presence of laryngoscopic signs consistent with LPRD (RSI >13 and RFS >7). (3) Behavioral adherence assessment: Adherence to lifestyle and dietary recommendations was evaluated through review of follow-up diaries or outpatient records. The adherence rate was calculated as actual days followed divided by total recommended days. Based on prior literature,18 an adherence rate ≥90% was defined as “high adherence” and <90% as “low adherence.” (4) Laryngoscopic assessment: In addition to using the RFS for diagnosis, pretreatment electronic laryngoscopy recordings were specifically re-reviewed by otolaryngologists to evaluate the presence or absence of a distinct morphological feature: chronic hyperplastic or keratotic changes. This specific finding was selected as a candidate predictor based on literature suggesting that such lesions represent a state of chronic tissue remodeling (e.g., epithelial hyperplasia and keratinization) in response to persistent reflux injury.23,24 Unlike the acute or inflammatory signs captured by the RFS, these structural changes may indicate a more advanced or treatment-resistant phenotype. Positive findings were defined as the presence of vocal fold leukoplakia or thick white plaques, significant interarytenoid mucosal hyperplasia with keratosis, or diffuse polypoid degeneration (Reinke’s edema) indicative of chronic change. For analysis, this finding was dichotomized as “Present” or “Absent.” (5) Voice assessment: Pretreatment voice samples were obtained via sustained phonation of the vowel /a/. Acoustic analysis was performed using Praat software to extract parameters including jitter (%), shimmer (%), and noise-to-harmonic ratio (NHR). Voice quality was dichotomized as “Abnormal” if either jitter >1.04% or shimmer >3.81%, or “Normal” otherwise.25
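The two dichotomizations described above (lifestyle adherence and voice quality) reduce to simple threshold checks. The following is a minimal Python sketch under stated assumptions: the function and argument names are hypothetical, and adherence is assumed to be counted in whole days.

```python
def adherence_category(days_followed: int, days_recommended: int) -> str:
    """Classify lifestyle adherence: rate >= 90% is 'high', otherwise 'low'.

    Hypothetical helper. The comparison days_followed / days_recommended >= 0.9
    is done in integers to avoid floating-point edge cases at exactly 90%.
    """
    if days_recommended <= 0:
        raise ValueError("days_recommended must be positive")
    return "high" if 10 * days_followed >= 9 * days_recommended else "low"


def voice_category(jitter_pct: float, shimmer_pct: float) -> str:
    """Classify voice as 'abnormal' if jitter > 1.04% or shimmer > 3.81%."""
    return "abnormal" if jitter_pct > 1.04 or shimmer_pct > 3.81 else "normal"
```

Both thresholds are strict for the voice parameters, so values exactly at 1.04% jitter and 3.81% shimmer are classified as normal, whereas an adherence rate of exactly 90% counts as high adherence.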
2.4. Statistical Analysis
Statistical analyses were performed using SPSS 26.0 and R 4.2.1. Descriptive statistics were reported as mean±standard deviation or frequency (percentage). Group comparisons used independent t-tests or chi-square tests. Variables with P<0.05 in univariate analysis were entered into multivariate logistic regression to identify independent predictors. A predictive model was constructed based on these predictors. Model performance was assessed using ROC analysis to calculate AUC, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Calibration was evaluated with the Hosmer-Lemeshow test and calibration plots. Decision curve analysis (DCA) was performed to evaluate clinical utility. SHAP (Shapley Additive Explanations) analysis was performed in R to interpret predictor contributions and visualize variable importance based on mean absolute SHAP values. The dataset was split into training and internal validation sets (approximately 70:30 ratio) to develop and validate the model, respectively. To ensure random allocation, patients were assigned to the training and validation sets using a computer-generated random sequence produced with the “sample” function in R 4.2.1. No formal a priori power calculation was performed for this retrospective exploratory study. Instead, the adequacy of the sample size was assessed using the events-per-variable (EPV) rule for logistic regression analysis. With 76 events (PPI-nonresponsive patients) in the training set and four predictors retained in the multivariate analysis, the EPV was 19. This exceeds the minimum recommended EPV of 10, which is widely considered sufficient to ensure the reliability of the regression coefficients and to limit model overfitting.26,27
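The EPV arithmetic and the random 70:30 allocation can be illustrated as follows. This is a Python sketch only: the study used the “sample” function in R, and the seed, function names, and rounding rule here are assumptions made for illustration, so the resulting split will not reproduce the study's actual allocation.

```python
import random


def events_per_variable(n_events: int, n_predictors: int) -> float:
    """EPV = number of outcome events / number of model predictors."""
    return n_events / n_predictors


def split_indices(n: int, train_fraction: float = 0.7, seed: int = 2024):
    """Randomly allocate indices 0..n-1 to training and validation sets.

    The seed is hypothetical; it only makes this sketch reproducible.
    """
    rng = random.Random(seed)
    shuffled = rng.sample(range(n), n)  # random permutation of all patients
    n_train = round(n * train_fraction)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


epv = events_per_variable(n_events=76, n_predictors=4)  # 76 / 4 = 19.0
train_idx, valid_idx = split_indices(301)
```

With 76 events and four predictors the function returns an EPV of 19, matching the value stated above; the split covers all 301 patients with no overlap between the two sets.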
3. Results
3.1. Baseline Characteristics
Comparison of Patient Baseline Characteristics in Training Set
Note. t: t-test, χ2: Chi-square test.
Comparison of Patient Baseline Characteristics in Validation Set
3.2. Comparison of Clinical Indicators Between Groups
Comparison of Clinical Indicators Between Groups in Training Set
Comparison of Clinical Indicators Between Groups in Validation Set
3.3. Univariate and Multivariate Logistic Regression Analyses for Factors Influencing PPI Treatment Response
Univariate Logistic Regression Analysis
Note. β: Regression Coefficient; S.E.: Standard Error; OR: Odds Ratio; CI: Confidence Interval.
Multivariate Logistic Regression Analysis
Note. β: Regression Coefficient; S.E.: Standard Error; OR: Odds Ratio; CI: Confidence Interval.
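The multivariate odds ratios reported in the abstract (RSI per point 1.180, low adherence 4.660, hyperplastic/keratotic changes 5.440, abnormal voice 2.755) combine multiplicatively on the odds scale, because each regression coefficient is β = ln(OR). The Python sketch below illustrates this under stated assumptions: a standard main-effects logistic model with no interaction terms, and hypothetical names throughout. The model intercept is not reported in the text, so only odds relative to a reference patient can be computed, not absolute probabilities.

```python
import math

# Odds ratios as reported in the abstract; the dictionary keys are hypothetical labels.
ODDS_RATIOS = {
    "rsi_per_point": 1.180,
    "low_adherence": 4.660,
    "lesions_present": 5.440,
    "abnormal_voice": 2.755,
}

# Each logistic regression coefficient is the natural log of its odds ratio.
BETAS = {name: math.log(value) for name, value in ODDS_RATIOS.items()}


def relative_odds(rsi_points_above_reference: float, low_adherence: bool,
                  lesions_present: bool, abnormal_voice: bool) -> float:
    """Odds of PPI nonresponse relative to a reference patient who has the
    reference RSI, high adherence, no lesions, and normal voice."""
    log_odds_ratio = (
        BETAS["rsi_per_point"] * rsi_points_above_reference
        + BETAS["low_adherence"] * low_adherence
        + BETAS["lesions_present"] * lesions_present
        + BETAS["abnormal_voice"] * abnormal_voice
    )
    return math.exp(log_odds_ratio)
```

Under these assumptions, a patient with low adherence and hyperplastic/keratotic changes, but otherwise matching the reference, has relative odds of 4.660 × 5.440, and each additional RSI point multiplies the odds by 1.180.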
3.4. Development and Performance Evaluation of the Predictive Model for PPI Nonresponse
A predictive model was constructed using the four independent predictors. In the training set, the model demonstrated excellent discriminative ability with an AUC of 0.856 (95% CI: 0.805-0.907). At the optimal cutoff, sensitivity was 0.934 (95% CI: 0.878-0.990), specificity 0.679 (0.600-0.758), PPV 0.623 (0.534-0.712), and NPV 0.948 (0.903-0.992). The Hosmer-Lemeshow test indicated good calibration (P>0.05), and DCA showed a positive net benefit across a range of threshold probabilities.
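The four threshold-based metrics above follow directly from a 2×2 confusion matrix. The sketch below is illustrative Python, not the authors' SPSS/R code. The cell counts (TP=71, FN=5, FP=43, TN=91) are not stated in the text; they are an assumption, back-calculated here from the 76 training-set events and the reported proportions. The Wald (normal-approximation) confidence interval is likewise an assumed method.

```python
import math


def proportion_with_wald_ci(successes: int, total: int, z: float = 1.96):
    """Return (estimate, lower, upper) using the Wald normal approximation."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV, each with a 95% Wald CI."""
    return {
        "sensitivity": proportion_with_wald_ci(tp, tp + fn),
        "specificity": proportion_with_wald_ci(tn, tn + fp),
        "ppv": proportion_with_wald_ci(tp, tp + fp),
        "npv": proportion_with_wald_ci(tn, tn + fn),
    }


# Assumed counts (reconstructed, not reported): 76 nonresponders, 134 responders.
metrics = diagnostic_metrics(tp=71, fn=5, fp=43, tn=91)
for name, (estimate, lower, upper) in metrics.items():
    print(f"{name}: {estimate:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With these assumed counts, the computed estimates and intervals round to the values reported above, which suggests (but does not confirm) that the published intervals were obtained with a comparable normal approximation.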
Confusion Matrix

Predictive model for PPI nonresponse: ROC curve, calibration curve, and DCA curve (training set).

Predictive model for PPI nonresponse: ROC curve, calibration curve, and DCA curve (validation set).
3.5. SHAP-Based Interpretability of the Predictive Model for PPI Nonresponse in Suspected LPRD Patients
As shown in Figure 4, “Lifestyle compliance <90%” emerged as the most influential predictor, followed by “RSI pretreatment”, “Presence of laryngoscopic hyperplastic/keratotic changes”, and “Abnormal pretreatment voice acoustic parameters”. These results align with the multivariate logistic regression findings and underscore the predominant role of behavioral adherence and baseline symptom severity in determining PPI nonresponse, while also highlighting the added value of laryngoscopic structural changes and objective voice abnormality in refining predictive accuracy.
Figure 4. SHAP-based interpretability of the optimal predictive model. (A) Mean SHAP values rank predictor importance. (B) Relative contribution of each variable to the model outcome. SHAP, Shapley additive explanations.
4. Discussion
The management of LPRD remains a clinical challenge, marked by variable patient responses to first-line PPI therapy. While numerous factors have been postulated to influence therapeutic outcomes, a consensus on robust and actionable predictors of PPI failure is still lacking. The primary objective of this study was to identify independent, clinic-based predictors for poor therapeutic response to standard-dose PPI therapy in patients with LPRD and to develop a clinically applicable predictive model. Our findings revealed that a constellation of four pre-treatment factors—severe baseline symptom burden, suboptimal adherence to lifestyle modifications, the presence of chronic hyperplastic/keratotic findings on laryngoscopy, and objective voice abnormalities on acoustic analysis—collectively and independently forecast a diminished likelihood of symptomatic improvement following a three-month PPI regimen. These results illuminate the multifaceted nature of LPRD and underscore the imperative for a stratified management approach that moves beyond empirical PPI trials by incorporating easily accessible clinical, morphological, and functional assessments.
The strong independent association between a higher pre-treatment RSI score and PPI nonresponse is a pivotal finding. The RSI, a patient-reported outcome measure, quantifies the subjective burden of laryngopharyngeal symptoms. 22 A markedly elevated score may be a clinical marker of a more severe or chronic disease phenotype that is less responsive to acid suppression alone. 28 It is plausible that in these patients, non-acidic reflux components (e.g., pepsin or bile acids) could play a more prominent role in symptom generation, though our study cannot confirm this as we did not perform pH-impedance monitoring or pepsin detection.29,30 Regardless of the exact mechanism, the strong association between baseline symptom severity and poor response identifies this group as high-risk. Therefore, patients presenting with severe symptomatology may represent a distinct subgroup where the primary driver of symptoms extends beyond acid-mediated injury, rendering acid suppression insufficient as a sole therapeutic strategy. This is further supported by Boom et al, 31 whose prospective study in patients with globus pharyngeus confirmed that a higher baseline RSI (>13) was significantly associated with poorer response to empirical PPI therapy. Furthermore, our finding aligns with prior research demonstrating that a higher pre-treatment RSI score is associated with PPI nonresponse, as reported by Yadlapati et al, 17 further supporting the role of initial symptom severity in predicting therapeutic refractoriness.
The profound impact of behavioral adherence on treatment outcomes cannot be overstated. Our operational definition of low adherence, compliance with lifestyle interventions <90%, emerged as a powerful independent predictor of PPI failure. This finding is corroborated by a prospective cohort study by Yun et al, 18 which demonstrated that high compliance with lifestyle modifications was a significant independent predictor of good treatment response in LPRD patients, further underscoring the critical role of patient adherence in determining therapeutic outcomes. This reinforces the foundational principle that LPRD management is inherently multimodal. Lifestyle modifications, including dietary adjustments, weight management, and avoiding recumbency after meals, aim to reduce the total reflux burden by decreasing abdominal pressure, improving lower esophageal sphincter function, and minimizing triggering factors. When patients inadequately adhere to these measures, the ongoing mechanical and chemical insult from refluxate may simply overwhelm the pharmacological effect of acid suppression. 32 A PPI cannot compensate for frequent supine positioning after a large, high-fat meal. This finding shifts a portion of the responsibility for therapeutic failure from drug efficacy to implementation science and patient engagement. It emphasizes that educating patients on the necessity of comprehensive lifestyle change is not ancillary but central to successful treatment. Strategies to enhance adherence, such as structured educational programs, motivational interviewing, and digital health reminders, could be integral to improving overall outcomes and should be considered a core component of LPRD management protocols. This is further corroborated by Chappity et al, 33 whose study highlighted the critical role of sustained lifestyle modification, showing significantly higher recurrence rates in patients who discontinued lifestyle adjustments after PPI therapy. 
Similarly, a comprehensive 2025 review by Bucan et al 34 reiterates that dietary and lifestyle interventions are essential components of a multidisciplinary framework for sustainable GERD management. These findings collectively underscore the independent importance of lifestyle adherence in achieving and maintaining treatment success.
Perhaps the most novel predictor identified was the pretreatment presence of chronic hyperplastic or keratotic lesions on laryngoscopy (e.g., vocal fold leukoplakia, thick white plaques, interarytenoid hyperplasia). This finding provides a morphological correlate for PPI resistance. These “white lesions” likely reflect a state of chronic tissue alteration, which may be less reversible with the acid suppression provided by a standard 3-month PPI course.19,20 The presence of such lesions at baseline may therefore define a distinct “tissue-remodeling” phenotype of LPRD, characterized by advanced mucosal changes for which acid suppression alone may be insufficient and for which expectations of rapid and complete symptomatic improvement with PPI monotherapy should be tempered. 35 Such patients may warrant a more intensive initial strategy, potentially including higher-dose PPI, mucosal protectants, and closer surveillance, rather than the standard empirical trial, although this remains to be tested prospectively.
Similarly, the inclusion of objective voice acoustic abnormality as a predictor adds a functional dimension to the model. Abnormal jitter and shimmer are quantifiable markers of disrupted vocal fold vibration, which can result from LPRD-related mucosal changes such as edema or lesions.36,37 The objective presence of such voice abnormalities at baseline indicates a significant functional impact of the disease and may identify patients whose recovery requires more than just acid suppression, potentially necessitating adjunctive voice therapy. 38 While PPIs may alleviate some contributing mucosal edema, they do not directly address neuromuscular components or compensatory maladaptive vocal habits that may have developed. 39 Therefore, patients with objective dysphonia at presentation are likely to experience slower and less complete symptomatic recovery with PPI monotherapy. This factor helps identify patients whose treatment needs extend beyond acid suppression.
The predictive model integrating these four variables demonstrated robust discriminative performance. While this suggests its potential as a clinical tool, these findings are preliminary and require prospective validation. The model’s ability to accurately identify high-risk patients in a research setting does not automatically translate to improved outcomes in clinical practice. If confirmed in future studies, this model could eventually serve as a basis for clinical decision-making. However, at this stage, our results are hypothesis-generating. They suggest that patients characterized by severe symptoms, evidence of chronic tissue change, and objective voice impairment may represent a high-risk phenotype. Whether this phenotype would benefit from alternative or augmented strategies (such as intensified PPI dosing, adjunctive therapies, or earlier intervention) must be tested in future randomized controlled trials that compare standard care against model-guided treatment pathways.
This study is not without limitations. Its retrospective, single-center design may limit the generalizability of the findings. The assessment of lifestyle adherence, though based on clinical records, is inherently subjective. Furthermore, the definitions and inter-rater reliability for categorizing laryngoscopic “hyperplastic/keratotic” findings require standardization in future studies. Importantly, the diagnosis of LPRD in this study was based on symptom (RSI) and laryngoscopic (RFS) criteria rather than the more objective 24-hour multichannel intraluminal impedance-pH (MII-pH) monitoring, which is considered a reference standard for detecting both acid and non-acid reflux events. The absence of MII-pH monitoring limits our ability to definitively confirm reflux as the underlying cause of symptoms. Consequently, some patients classified as PPI-nonresponsive may not have true reflux-related disease but rather alternative etiologies such as functional laryngeal symptoms, chronic laryngitis of other causes, or upper airway conditions. This potential diagnostic misclassification should be considered when interpreting the predictive model. Future research should aim to prospectively validate this predictive model in diverse populations using MII-pH monitoring and investigate the optimal management pathways for the high-risk phenotype it defines. Randomized controlled trials comparing standard PPI therapy versus intensified multi-modal treatment in patients stratified by this model are needed.
5. Conclusion
In conclusion, this study identifies a profile of the suspected LPRD patient at high risk for PPI treatment failure, characterized by a severe baseline symptom burden, suboptimal adherence to lifestyle modifications, the presence of chronic hyperplastic or keratotic lesions on laryngoscopy, and objective evidence of voice impairment. These factors point to a complex disease phenotype where pathophysiological mechanisms may extend beyond acid-mediated injury.
The predictive model developed from these variables demonstrates excellent discriminative ability for identifying poor PPI responders. However, these findings should be considered hypothesis-generating. The model provides a strong empirical foundation for future research, but it is not yet ready for clinical implementation. Prospective studies are essential to externally validate the model in diverse populations and to investigate whether its use can guide treatment decisions and ultimately improve patient outcomes compared to current clinical practice. Such research will determine if this model can facilitate a shift towards a more stratified management approach for patients with LPRD.
Footnotes
Ethical Considerations
This study was reviewed and approved by the Ethics Committee of The First Affiliated Hospital of Henan University of Chinese Medicine (Approval No. AF/SC-08/03.2) and was conducted in accordance with the Declaration of Helsinki and its later amendments.
Author Contributions
Bing Wang: Conceptualization, Methodology, Writing-Original draft preparation. Xiangdong Guo: Data curation, Writing-Original draft preparation, Software, Validation, Investigation. Xiangsheng Mei: Visualization, Correspondence.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by the Special Research Project on Traditional Chinese Medicine in Henan Province (No. 2023ZY2017, No. 2019ZY1005).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data are available upon reasonable request.
