Abstract
Aim
This study aims to investigate the cross-sectional associations of caffeine-related metabolites with prevalent osteoarthritis (OA).
Methods
The study comprised 4583 US adults from the 2009–2014 National Health and Nutrition Examination Survey who had data on urinary caffeine-related metabolites and responded to the questions “Doctor ever said you had arthritis?” and “Which type of arthritis was it?” Logistic regression was used to examine the association between each metabolite and OA prevalence, accounting for the complex survey design and adjusting for age, sex, race, body mass index (BMI), household income, smoking status, and alcohol consumption. To assess the joint association of the individual metabolites, a composite metabolomic score was constructed using principal component analysis.
Results
In the study sample, 653 adults reported having OA. Participants with OA were more likely to be older, female, non-Hispanic White, and never smokers, and to have a higher BMI. The majority of caffeine-related metabolites showed positive associations with prevalent OA in unadjusted models, but these associations were attenuated to null in multivariable-adjusted models. In further exploratory stratified analyses, higher levels of 3-methyluric acid, 7-methyluric acid, 3,7-dimethyluric acid, 3-methylxanthine, and 7-methylxanthine were associated with a higher prevalence of OA in women, but not in men. Similarly, a significant positive association was observed between the composite metabolomic score and OA in women but not in men.
Conclusion
In this cross-sectional study, certain caffeine-related metabolites were positively associated with the prevalence of OA in women. Well-designed cohort studies are needed to investigate the association of caffeine consumption and related metabolites with OA risk.
Introduction
Osteoarthritis (OA) is the most common type of arthritis (Sen and Hurley, 2023; Osteoarthritis, 2024). About 33 million US adults aged 18 and above have OA (Osteoarthritis, 2024), and it is a leading cause of disability (Hootman et al., 2016). Adults with OA have chronic joint pain, stiffness, and swelling in the knees, hands, hips, and back (Sen and Hurley, 2023). The burden of OA and related complications will increase with the growing and aging US population (Hootman et al., 2016). People with OA experience joint pain and structural damage, leading to limitations in their daily activities and restrictions in work and social participation (Fallon et al., 2023; Hootman et al., 2016). Additionally, individuals with OA often have coexisting chronic diseases (Fallon et al., 2023), including anxiety and depression (Guglielmo et al., 2018).
Caffeine (1,3,7-trimethylxanthine) is a naturally occurring alkaloid found in coffee beans, tea leaves, and some fruits (Gracia-Lor et al., 2017). Caffeine is commonly consumed in beverages such as coffee, tea, soft drinks, and energy drinks (Gracia-Lor et al., 2017). It is also present in cocoa, chocolate, analgesic medication formulations, and dietary supplements (Gracia-Lor et al., 2017). Consumed caffeine is metabolized in the human liver, where demethylation forms three major primary metabolites: theobromine (3,7-dimethylxanthine), paraxanthine (1,7-dimethylxanthine), and theophylline (1,3-dimethylxanthine) (Gracia-Lor et al., 2017; Thorn et al., 2012). These primary metabolites are then further broken down in the liver by additional demethylation and oxidation, forming secondary metabolites such as 1-methyluric acid, 3-methyluric acid, 7-methyluric acid, 1,3-dimethyluric acid, 1,7-dimethyluric acid, 3,7-dimethyluric acid, 1,3,7-trimethyluric acid, 1-methylxanthine, 3-methylxanthine, 7-methylxanthine, and 5-acetylamino-6-amino-3-methyluracil (Gracia-Lor et al., 2017; Thorn et al., 2012). Finally, the primary and secondary metabolites are excreted in the urine (Gracia-Lor et al., 2017; Thorn et al., 2012). The half-life of caffeine ranges between 4 and 5 hours and is prolonged (up to 100 hours) among people with liver diseases, children, and pregnant women. In contrast, smoking reduces the half-life, as it activates the liver enzymes involved in caffeine metabolism (Thorn et al., 2012).
Animal and in vitro studies suggest that caffeine may impair cartilage health by suppressing chondrocyte proliferation, reducing key extracellular matrix components, and disrupting anabolic pathways, ultimately weakening cartilage structure and repair capacity and promoting degeneration (Guillán-Fresco et al., 2020). Caffeine has also been shown to alter chondrocyte metabolism and increase cholesterol accumulation, further compromising cartilage quality. Consistent with these mechanisms, observational studies report a positive association between caffeine intake and OA, although they rely on self-reported measures that may introduce bias (Bang et al., 2019; Zhang et al., 2021). To address this limitation, the present study uses urinary caffeine metabolites as objective biomarkers to examine their association with OA prevalence.
Materials and methods
Study design and population
The National Health and Nutrition Examination Survey (NHANES) is a continuous annual survey designed to represent the noninstitutionalized US population (NHANES questionnaires, datasets and related documentation, n.d.). The NHANES data were collected from all 50 states, the District of Columbia, and US territories. Active-duty military personnel and US residents living overseas were excluded from the NHANES survey (National Health and Nutrition Examination Survey (NHANES) | CMS, n.d.). Approximately 5000 nationally representative participants were enrolled every year, and each two-year period of data constitutes a wave (National Health and Nutrition Examination Survey (NHANES) | CMS, n.d.). The NHANES uses a computerized program to randomly select the sampling units, which include counties, neighborhoods, households, and individuals within households (National Health and Nutrition Examination Survey (NHANES) | CMS, n.d.). Extensive information was collected through interviews, physical examinations, and laboratory testing. The detailed NHANES study design and data are publicly available at https://www.cdc.gov/nchs/nhanes. Briefly, the interview process included data collection on demographic, socioeconomic, dietary, and health-related information. The physical examination included medical, dental, and physiological measurements (height, weight, blood pressure, etc.). Laboratory testing involved blood and urine tests, which were administered by well-trained professionals (NHANES questionnaires, datasets and related documentation, n.d.). The NHANES team obtained IRB approval and informed consent from the participants (Ethics Review Board approval, 2024).
In this study, we initially included 8095 participants with available urine caffeine-related metabolite data from 2009 to 2014. Participants under 20 years of age and those with self-reported arthritis types other than OA (e.g., rheumatoid arthritis, gout arthritis, and psoriatic arthritis) were excluded. Ultimately, 4583 participants were included in the analysis (Figure 1).

Analytical study sample selection from the participants with urine caffeine-metabolite data from the National Health and Nutrition Examination Survey (NHANES) 2009–2014.
Prevalent OA cases
Participants who responded “Yes” to “Doctor ever said you had arthritis” and specified “osteoarthritis or degenerative arthritis” as the type were classified as OA cases. Participants who responded “No” to “Doctor ever said you had arthritis” were classified as non-OA cases. Individuals reporting other types of arthritis (such as rheumatoid arthritis or psoriatic arthritis) were excluded from the analysis.
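The case definition above can be expressed as a small decision rule. The following is a minimal sketch, not the authors' code: the function name `classify_oa`, the plain-text response values, and the handling of missing or "don't know" answers (treated here as excluded) are assumptions made for illustration.

```python
from typing import Optional

# Response label taken from the questionnaire wording quoted in the text.
OA_TYPE = "osteoarthritis or degenerative arthritis"


def classify_oa(ever_told_arthritis: Optional[str],
                arthritis_type: Optional[str]) -> Optional[int]:
    """Return 1 for an OA case, 0 for a non-OA participant, None if excluded."""
    if ever_told_arthritis == "No":
        return 0  # no doctor-diagnosed arthritis of any type
    if ever_told_arthritis == "Yes" and arthritis_type == OA_TYPE:
        return 1  # doctor-diagnosed OA
    # Other arthritis types and missing or unclear answers are treated
    # as excluded in this sketch (an assumption, see the lead-in).
    return None
```

Returning `None` for excluded participants keeps them distinguishable from non-OA participants, so they can be dropped before modeling rather than silently counted as controls.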
Urine caffeine-related metabolites
Using high-performance liquid chromatography (HPLC) coupled with tandem mass spectrometry, caffeine (1,3,7-trimethylxanthine) and 14 of its metabolites, including 1-methyluric acid, 3-methyluric acid, 7-methyluric acid, 1,3-dimethyluric acid, 1,7-dimethyluric acid, 3,7-dimethyluric acid, 1,3,7-trimethyluric acid, 1-methylxanthine, 3-methylxanthine, 7-methylxanthine, 1,3-dimethylxanthine (theophylline), 1,7-dimethylxanthine (paraxanthine), 3,7-dimethylxanthine (theobromine), and 5-acetylamino-6-amino-3-methyluracil, were measured in urine. Detailed urine sample collection and processing instructions are presented in the NHANES Laboratory/Medical Technologists Procedures Manual (Nutritional Biomarkers Branch [NBB], Division of Laboratory Sciences [DLS], and National Center for Environmental Health [NCEH], 2009b).
Study covariates
We obtained variables from the NHANES database, including age, sex, race, educational level, smoking status, alcohol consumption, height, weight, and body mass index (BMI). Data preprocessing followed a systematic process to ensure data quality, consistency, and suitability for epidemiological research. Education, race, marital status, annual income, and BMI were consolidated into categories in line with standard practice in epidemiological studies, so that each category was adequately represented. Because smoking status and alcohol consumption are related to coffee or caffeine intake and metabolism, these factors were considered in the analysis. Given the low proportion of missing values across most covariates, we employed simple imputation strategies. Missing BMI values were imputed using the sex-specific sample mean, while missing values for categorical covariates, such as education and marital status, were imputed using the mode (the most frequent category).
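The two imputation rules described above can be illustrated as follows. This is a minimal sketch on hypothetical records, assuming missing values are coded as `None` and that the sample mean is unweighted; the helper names `impute_bmi_by_sex` and `impute_mode` and the example values are illustrative and do not come from the study data.

```python
from collections import Counter
from statistics import mean


def impute_bmi_by_sex(rows):
    """Fill missing 'bmi' with the mean BMI of participants of the same sex."""
    observed = {}
    for row in rows:
        if row["bmi"] is not None:
            observed.setdefault(row["sex"], []).append(row["bmi"])
    sex_means = {sex: mean(values) for sex, values in observed.items()}
    return [
        {**row, "bmi": sex_means[row["sex"]]} if row["bmi"] is None else dict(row)
        for row in rows
    ]


def impute_mode(rows, column):
    """Fill missing values in a categorical column with its most frequent category."""
    counts = Counter(row[column] for row in rows if row[column] is not None)
    most_common = counts.most_common(1)[0][0]
    return [
        {**row, column: most_common} if row[column] is None else dict(row)
        for row in rows
    ]


# Hypothetical records, for illustration only.
sample = [
    {"sex": "F", "bmi": 30.0, "education": "college"},
    {"sex": "F", "bmi": None, "education": "college"},
    {"sex": "F", "bmi": 26.0, "education": None},
    {"sex": "M", "bmi": 28.0, "education": "high school"},
    {"sex": "M", "bmi": None, "education": "college"},
]
completed = impute_mode(impute_bmi_by_sex(sample), "education")
```

Single-value imputation of this kind is simple but understates variability, which is consistent with the caveat raised in the limitations.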
Statistical analysis
In this study, we included NHANES data from 2009–2014 because urine caffeine and its metabolites were measured only during these three waves. The data files from these three waves were imported, combined, and analyzed in R Statistical Software (version 4.1.2, R Core Team 2021). The complex multistage probability cluster sampling design was accounted for in the descriptive and analytical statistical analyses; therefore, the results from this study can be generalized to US adults. Urine caffeine-related metabolites were categorized into quartiles. We investigated the associations between each metabolite and OA prevalence using weighted logistic regression models. Models were adjusted for age, sex, race, BMI, household income, marital status, education, smoking status, and alcohol consumption. We combined the individual urine caffeine metabolites into a composite metabolomic score derived from principal component analysis (PCA).
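To illustrate how survey weights enter the analysis when three two-year waves are pooled, the sketch below rescales two-year weights and computes a weighted prevalence. It rests on assumptions: a common convention for pooling k two-year cycles is to divide each two-year weight by k, but the text does not state which weight variable or rescaling the authors used, and the participant values are hypothetical. Design-based standard errors, which also need strata and sampling-unit information, are not shown.

```python
N_CYCLES = 3  # 2009-2010, 2011-2012, 2013-2014


def pooled_weight(two_year_weight: float, n_cycles: int = N_CYCLES) -> float:
    """Rescale a 2-year weight so the pooled sample represents one population."""
    return two_year_weight / n_cycles


def weighted_prevalence(outcomes, weights):
    """Weighted proportion of participants with outcome == 1."""
    total = sum(weights)
    cases = sum(w for y, w in zip(outcomes, weights) if y == 1)
    return cases / total


# Hypothetical participants: OA status and 2-year survey weight.
oa = [1, 0, 0, 1, 0]
two_year = [3000.0, 9000.0, 6000.0, 1500.0, 10500.0]
weights = [pooled_weight(w) for w in two_year]
prevalence = weighted_prevalence(oa, weights)  # 0.15 weighted vs. 0.40 unweighted
```

In this toy example the two cases carry small weights, so the weighted prevalence is well below the unweighted proportion, which is why ignoring the weights can misstate population estimates.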
Individual metabolite analysis
Metabolite measurements were transformed into ordinal data using quartile-based stratification, and separate weighted logistic regression models were employed to examine the relationship between each metabolite and the prevalence of OA. The initial analysis incorporated a comprehensive set of potential covariates. Backward model selection was used to develop parsimonious models. To compute p-trends for each metabolite, we used a continuous metabolite variable scaled with the Variance Stabilizing Normalization (VSN) approach, which addresses potential measurement variability (Huber et al., 2002).
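The quartile coding and the unadjusted comparison of the top versus bottom quartile can be sketched as follows. For a single binary exposure with no covariates, the weighted logistic regression estimate of the odds ratio equals the weighted cross-product ratio, which is what the second function computes. The function names and all data are hypothetical; adjusted models, backward selection, and the VSN-based trend test require iterative model fitting in survey software and are not reproduced here.

```python
from statistics import quantiles


def quartile_labels(values):
    """Assign each value to quartile 1-4 using the sample's own cut points."""
    q1, q2, q3 = quantiles(values, n=4)
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in values]


def weighted_odds_ratio(exposed, outcome, weights):
    """Weighted cross-product odds ratio for a binary exposure and outcome."""
    cells = {(e, y): 0.0 for e in (0, 1) for y in (0, 1)}
    for e, y, w in zip(exposed, outcome, weights):
        cells[(e, y)] += w  # accumulate survey weight in each 2x2 cell
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])


# Hypothetical urinary concentrations for eight participants.
concentrations = [0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0, 12.0]
labels = quartile_labels(concentrations)  # [1, 1, 2, 2, 3, 3, 4, 4]

# Hypothetical Q4 (exposed = 1) versus Q1 (exposed = 0) comparison.
exposed = [1, 1, 1, 0, 0, 0]
outcome = [1, 1, 0, 1, 0, 0]
weights = [2.0, 2.0, 1.0, 1.0, 3.0, 3.0]
unadjusted_or = weighted_odds_ratio(exposed, outcome, weights)  # 24.0
```

The cut points here are unweighted sample quantiles; whether the study used weighted or unweighted quartile boundaries is not stated in the text.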
A composite metabolomic score
To assess the association of overall caffeine-related metabolites with OA, we developed a combined metabolomic score using PCA. First, we applied the VSN approach to stabilize the variance of the metabolomic data. The PCA accounts for the complex intercorrelations among 15 metabolites, significantly reduces the number of statistical tests, and helps prevent false-positive findings. Briefly, each principal component represents a linear combination of all metabolites, weighted by their factor loadings, and is designed to explain as much interindividual variation in the metabolites as possible. Since the first principal component explained more than 72% of the total variance, the metabolomic score was derived based on the first component. This score was subsequently evaluated as a predictor of OA using logistic regression models, as described previously.
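The construction of a score from the first principal component can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: the columns are standardized and the PCA is run on the correlation matrix (the text does not say whether the covariance or correlation matrix was used), the VSN step is omitted, and the three short metabolite columns are hypothetical. Power iteration is used here only to keep the example dependency-free.

```python
from statistics import mean, pstdev


def standardize(columns):
    """Center and scale each metabolite column to mean 0 and SD 1."""
    scaled = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        scaled.append([(v - m) / s for v in col])
    return scaled


def correlation_matrix(z_columns):
    """Correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in z_columns]
            for ci in z_columns]


def first_component(matrix, iterations=500):
    """Leading eigenvector (loadings) and eigenvalue by power iteration."""
    p = len(matrix)
    vec = [1.0 / p ** 0.5] * p  # positive start suits positively correlated data
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(p)) for row in matrix]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return vec, eigenvalue


# Hypothetical values of three metabolites for six participants.
metabolites = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
    [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
]
z = standardize(metabolites)
loadings, eigenvalue = first_component(correlation_matrix(z))
explained = eigenvalue / len(metabolites)  # share of total variance
scores = [sum(loadings[j] * z[j][i] for j in range(len(z)))
          for i in range(len(z[0]))]
```

Each participant's score is the loading-weighted sum of their standardized metabolite values, and `explained` corresponds to the proportion of variance that the text reports as exceeding 72% for the first component in the study data.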
Sex-stratified analysis
To investigate potential sex-specific metabolite associations with OA, we conducted a sex-stratified analysis. Separate logistic regression models were applied for men and women to analyze the associations of both individual metabolites and the overall metabolomic score with the prevalence of OA. By examining unadjusted and multivariable-adjusted associations, we explored potential differences in metabolite–OA relationships across sex, accounting for nuanced biological and physiological variations.
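One simple way to compare stratum-specific estimates is a z-test on the difference between the two log odds ratios, treating the strata as independent. The sketch below shows that calculation with hypothetical inputs. It is an assumption that this resembles the interaction test used in the study; a product term between sex and the metabolite in a single model is the other common approach, and the text does not say which was applied.

```python
from math import erfc, log, sqrt


def interaction_p_value(beta_women, se_women, beta_men, se_men):
    """Two-sided p-value for the difference between two stratum log odds ratios."""
    z = (beta_women - beta_men) / sqrt(se_women ** 2 + se_men ** 2)
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical stratum-specific estimates on the log-odds scale.
p_interaction = interaction_p_value(log(1.8), 0.20, log(0.9), 0.20)
```

With these illustrative inputs the difference in log odds ratios is ln 2 with a standard error of about 0.28, giving a p-value of roughly 0.014.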
Results
Among the 4583 participants in our analytical sample, 653 reported doctor-diagnosed OA, and 3930 were non-OA cases. The participants with OA were more likely to be older, female, non-Hispanic White, and never smokers, and to have a higher BMI (Table 1). Table 2 presents the results of unadjusted and multivariable-adjusted analyses examining the association between individual urine caffeine-related metabolites and OA prevalence, while accounting for the complex sampling design. In the unadjusted models, 11 out of 15 caffeine-related metabolites demonstrated significantly positive associations with prevalent OA, showing a clear linear trend. After adjusting for age, sex, race, smoking status, and BMI, the associations attenuated. These results were similar after additional adjustment for education, household income, marital status, and alcohol intake.
Baseline characteristics of analytical sample with urine caffeine-related metabolite data from the NHANES 2009–2014.
Considering complex survey sampling design.
Odds ratios (ORs) and 95% confidence intervals (CIs) of prevalent osteoarthritis according to the quartiles of caffeine-related metabolites.a
Considering the complex sampling design.
Adjusting for age, sex, race, smoking status, and BMI.
BMI: body mass index.
To assess potential effect modification by sex, we conducted sex-specific analyses. The associations between caffeine-related metabolites and OA prevalence differed by sex (p for interaction < 0.050). In men, no significant associations were observed between any metabolites and prevalent OA. In contrast, among women, 3-methyluric acid, 7-methyluric acid, 3,7-dimethyluric acid, 3-methylxanthine, and 7-methylxanthine showed significant linear relationships with OA prevalence in multivariable-adjusted models. These results were similar after additional adjustment for education, household income, marital status, and alcohol intake (Table 3).
Odds ratios (ORs) and 95% confidence intervals (CIs) of prevalent osteoarthritis according to the quartiles (Q4 versus Q1) of caffeine-related metabolites, stratified by sex.a
Adjusting for age, race, smoking status, and BMI, and considering the complex sampling design.
BMI: body mass index.
We analyzed the overall metabolomic score in relation to OA prevalence. In unadjusted analyses, positive associations with a linear trend between the metabolomic score and OA were observed in men, women, and the entire sample. However, among women, a significant association persisted for the top quartile of the metabolomic score compared to the bottom quartile (OR = 2.00, 95% CI: 1.42–2.81, p < 0.001). After adjusting for covariates, the associations attenuated. These results were similar after additional adjustment for education, household income, marital status, and alcohol intake (Table 4).
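As a consistency check on the estimate reported for women (OR = 2.00, 95% CI: 1.42–2.81), the standard error and test statistic can be approximately recovered from the interval, because a Wald interval is symmetric on the log-odds scale. The sketch below assumes a normal-based interval with the quantile 1.96; survey software may instead use a t distribution with design-based degrees of freedom, so the recovered values are approximate. The function name `wald_summary` is illustrative.

```python
from math import erfc, log, sqrt

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumption)


def wald_summary(odds_ratio, ci_low, ci_high):
    """Recover the log-OR standard error, z statistic, and two-sided p-value."""
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = log(odds_ratio) / se
    p = erfc(abs(z) / sqrt(2))
    return se, z, p


se, z, p = wald_summary(2.00, 1.42, 2.81)
midpoint = sqrt(1.42 * 2.81)  # geometric mean of the CI limits
```

Under these assumptions the recovered standard error is about 0.17 and z is close to 4, which agrees with the reported p < 0.001, and the geometric mean of the interval limits is approximately 2.00, matching the point estimate.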
Odds ratios (ORs) and 95% confidence intervals (CIs) of prevalent osteoarthritis according to the quartiles of the composite metabolomic score.a
Considering the complex sampling design.
Adjusting for age, race, smoking status, and BMI.
BMI: body mass index; NHANES: National Health and Nutrition Examination Survey.
Discussion
To our knowledge, this is the first large population-based study to examine the associations between urine caffeine metabolites and OA prevalence. Our findings show that certain urine caffeine-related metabolites and the overall caffeine metabolomic score were positively associated with the prevalence of OA in US adults, especially among women. Urine caffeine-related metabolites provide an objective measure of caffeine consumption, which is otherwise usually self-reported and recorded as 24-h recall data. The NHANES data collection used the HPLC method to measure urine caffeine-related metabolites. This method is rapid, sensitive, accurate, and precise in measuring urine caffeine metabolites, and the measured concentrations correspond to dietary caffeine intake (Rybak et al., 2013).
Several animal studies have demonstrated potential effects of caffeine on the musculoskeletal system, including articular cartilage, the protective covering of the articular surfaces of synovial joints, and growth plate cartilage (Barone et al., 1993; Luo et al., 2015; Reis et al., 2018; Shangguan et al., 2017; Tan et al., 2012; Tan et al., 2018). Prenatal exposure to caffeine at doses below the clinical threshold for intoxication affects the fetal articular and growth plate cartilages (Barone et al., 1993; Luo et al., 2015; Reis et al., 2018; Shangguan et al., 2017; Tan et al., 2012; Tan et al., 2018). Experimental evidence suggests that caffeine may influence cartilage metabolism and inflammation, but human data are limited. A cross-sectional study showed that daily consumption of more than 7 cups of coffee was associated with OA among Korean men (Bang et al., 2019). The association with OA increased linearly with the number of cups consumed, although it did not reach statistical significance below 7 cups. A recent Mendelian randomization study based on self-reported coffee intake showed that a genetically predicted 1% increase in coffee consumption was associated with an increased risk of overall OA and knee OA, but not hip OA (Zhang et al., 2021). While prior studies have assessed self-reported caffeine intake, such measures may misclassify exposure due to recall bias and interindividual differences in metabolism. Urinary caffeine-related metabolites provide an integrated biomarker of both intake and metabolic processing, allowing for a more biologically informed evaluation of associations with OA prevalence.
The underlying mechanism is not completely known. A genomic study has shown that variations in specific genes can modify caffeine metabolism and its potential effects on health outcomes (Cornelis et al., 2016). The catabolic effects of caffeine on articular and growth plate cartilage can lead to long bone growth inhibition, OA, and an increased risk of osteoporosis and resulting fractures (Guillán-Fresco et al., 2020). A large genomic population study has shown a positive correlation between caffeine exposure above 95 mg/day and OA prevalence (Ji et al., 2024). Caffeine concentrations ranging from 1 to 100 μM decrease the mRNA expression of critical components of the extracellular matrix that surrounds articular cartilage cells (Choi et al., 2016). Also, caffeine consumption reduces the mRNA expression of various steps in the insulin-like growth factor 1 signaling pathway, which plays an important role in promoting anabolic responses in chondrocytes (Patil et al., 2011). Additionally, caffeine consumption reduces chondrocyte proliferation, which leads to joint surface irregularities in the superficial zone of the cartilage (Choi et al., 2016).
OA is more common among women (Ji et al., 2024), which could be attributed to hormonal, anatomical, and biochemical factors (Peshkova et al., 2022). Low estrogen levels, commonly seen among postmenopausal women, cause loss of muscle mass and bone (Peshkova et al., 2022). Muscle loss leads to joint instability and uneven anatomical loading, which ultimately leads to cartilage damage (Sipilä, 2003). The bone loss is due to osteoclast activation and an imbalance in calcium metabolism and homeostasis (Weitzmann, 2006). Although OA is characterized by increased bone density rather than bone mass deficiency, increased bone density contributes to OA progression and bone mass deficiency triggers the onset of OA (Burr and Gallant, 2012). The female anatomy has a lower limb varus deformity, which increases the mechanical load on the medial compartment and contributes to cartilage loss (Tummala et al., 2016). Also, women are predisposed to greater mechanical work in the joints than men (Kerrigan et al., 1998; Sims et al., 2009). Additionally, extracellular matrix-related cartilage degeneration, a biochemical factor, is predominantly observed among women (Li and Zheng, 2021).
This study has several strengths. It includes a relatively large sample with existing metabolomics data, and its results can be generalized to US adults because we used appropriate sample weights in our statistical analysis. Using measured urine caffeine metabolites overcomes the subjective, self-reported nature of dietary recall of caffeine consumption. Despite these strengths, we also acknowledge the limitations of this study. Firstly, the cross-sectional design of NHANES does not allow us to establish a cause-and-effect relationship between urine caffeine metabolites and OA risk, as the exposure and outcome are measured at the same time. We cannot rule out the possibility of reverse causation; for example, individuals with OA may change their caffeine consumption patterns. Additionally, although we adjusted for multiple covariates, unmeasured confounding factors such as comorbidities or other health-related factors may also have affected the observed results. A well-designed prospective cohort study is needed to investigate the association between caffeine consumption, related metabolites, and OA risk. Secondly, we imputed missing values using the mean, median, or mode for certain covariates, which might introduce bias into our findings. Thirdly, this study is an exploratory preliminary analysis of existing data; we did not account for multiple testing in the analyses of multiple metabolites and were unable to assess detailed information on OA, such as the type of OA (e.g., knee OA, hand OA, hip OA), the age at disease onset, the severity and progression of the disease, and other clinical characteristics, which limited the scope of our analysis. Lastly, we relied on self-reported OA, which might be subject to misclassification bias.
Conclusion
Overall, this study demonstrated that certain caffeine-related metabolites were positively associated with the prevalence of OA, especially among US women. It provided preliminary evidence suggesting that caffeine intake may be associated with OA. Well-designed large cohort studies are needed to investigate the prospective association between caffeine consumption, related metabolites, and risk of OA. Understanding the association between caffeine intake and OA risk could offer a potential strategy for OA prevention.
Footnotes
Acknowledgements
Grammarly was used for grammar and spelling checks. No other generative AI was used in the study or while drafting this manuscript.
Ethical approval and informed consent statements
This study is a secondary data analysis involving human participants. The Ethics Review Board of the National Center for Health Statistics reviewed and approved the primary data collection study design. Therefore, no additional ethical approval was required for this study. The participants provided their written informed consent to participate before data collection. Therefore, no additional informed consent was required for this study.
Consent for publication
The manuscript/research does not contain data from any individual person, so consent for publication is not required/applicable.
Authors’ contributions
Priya Sandhya Prakash contributed to data mining and writing—review and editing. Shike Xu contributed to methodology, data analysis, and writing—review and editing. Bing Lu contributed to conceptualization, methodology, data analysis, and writing—review and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institutes of Health (grant number R01074447A1).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Availability of data and materials
Data were collected from the NHANES database, which is a publicly accessible and free resource (https://www.cdc.gov/nchs/nhanes/).
