Abstract
Background:
In a nationwide randomised controlled trial among 200 axial spondyloarthritis (axSpA) patients, the medical app Axia improved patient-reported disease activity scores, functional status and quality of life.
Objective:
This companion study aimed to explore Axia’s effects on objective parameters such as mobility, strength and imaging.
Design:
Single-centre, two-phase pre-post intervention study over 24 weeks.
Methods:
Thirty-two patients with axSpA on stable pharmacotherapy underwent 12 weeks of standard care (phase I) followed by 12 weeks of Axia use (phase II). The primary endpoint was Bath Ankylosing Spondylitis Metrology Index (BASMI) at week 24 (W24) versus week 12 (W12) and baseline. Secondary endpoints included muscle strength, Bath Ankylosing Spondylitis Disease Activity Index (BASDAI), safety and magnetic resonance imaging (MRI) of the sacroiliac joints.
Results:
Twenty-seven (84%) of 32 participants completed the study (mean age 49.1 years; 48.1% female; 48.1% with radiographic axSpA; 66.7% on biological or targeted synthetic disease-modifying anti-rheumatic drug therapy). During standard care, BASMI (baseline 3.1; W12 3.0; p > 0.05) and BASDAI (baseline 4.7; W12 4.9; p > 0.05) remained unchanged, while spinal extensor strength declined by a median of 14%. During Axia use, BASMI improved to 2.4 (p < 0.001), BASDAI improved to 3.7 (p < 0.001) and spinal extensor strength increased by a median of 25% (p < 0.01). BASMI improvement was greater in patients with baseline MRI inflammation. MRI showed no relevant change in bone marrow oedema or structural damage. No app-related adverse events occurred.
Conclusion:
Axia use was associated with improved spinal mobility, extensor strength and disease activity, without relevant safety concerns.
Trial registration:
The study was registered in the German Clinical Trials Register (DRKS00038067).
Keywords
Introduction
Exercise therapy and patient education are key non-pharmacological pillars in the management of axial spondyloarthritis (axSpA) and are strongly recommended in all major international guidelines.1,2 However, integrating these interventions into everyday life remains challenging, and adherence is often suboptimal. 3
Digital health interventions may help bridge this gap, as exemplified by the smartphone application Axia. 4 Axia is a medical app that provides patient-tailored, axSpA-specific home-based exercise therapy combined with patient education and additional disease management features; so far, it is available only in German.4,5 In a German single-centre randomised controlled trial (RCT) with nationwide recruitment, the Bechterew-App Trial I, a 12-week intervention with Axia led to significant and clinically meaningful improvements in patient-reported disease activity, functional status and quality of life in patients with axSpA receiving stable pharmacotherapy. 4 However, because this trial was fully remote in order to reflect real-world conditions, only patient-reported outcome measures (PROMs) were assessed. 4
To further investigate the potential biomechanical benefits of this app intervention, a second interventional single-centre study with in-person hospital visits and comparable eligibility criteria was initiated at a German tertiary care centre (Bechterew-App Trial II). The primary aim of this companion study was to explore the effects of regular exercise therapy with Axia on objective disease-specific and biomechanical parameters, including spinal mobility, trunk muscle strength and body composition. Furthermore, adverse events (AEs), inflammation, and structural damage of the sacroiliac joints (SIJ) on magnetic resonance imaging (MRI) were assessed as safety outcomes. The study was registered in the German Clinical Trials Register (Deutsches Register Klinische Studien (DRKS)-ID: DRKS00038067).
Methods
Study design
The Bechterew-App Trial II was designed as an exploratory, single-centre, prospective, single-arm pre-post interventional study conducted over 24 weeks in patients with axSpA receiving stable pharmacotherapy (Figure 1 displays the study design). The trial was sponsored by the University Hospital of Wuerzburg. During phase I, participants received standard of care (SOC) alone for 12 weeks, followed by phase II, in which Axia was added to SOC for a further 12 weeks.

Figure 1. Study design and measurement protocol. All participants attended three study visits. At each visit, the same outcomes were assessed in a predefined and consistent order to minimise potential fatigue effects during the study protocol. MRI examinations were scheduled at the end of each visit; however, in some cases, scans had to be performed at the beginning of the visit for organisational reasons.
Physiotherapy and pharmacotherapy were to remain stable throughout the study. Changes in disease-modifying antirheumatic drug (DMARD) therapy due to disease flare or AEs resulted in withdrawal from the study. Owing to the study design, randomisation was not feasible and neither participants nor investigators were blinded.
The study was conducted in accordance with Good Clinical Practice and the Declaration of Helsinki. Ethical approval (approval no. 93/23-am; 31 July 2023) was obtained from the Ethics Committee of the medical faculty of the University of Wuerzburg, Germany (DE/EKBY13). The trial was registered in the German Clinical Trials Register (DRKS; DRKS00038067). The corresponding STROBE checklist is provided in the Supplemental Material.
Patient and public involvement
The national German patient self-help association Deutsche Vereinigung Morbus Bechterew (DVMB) and axSpA patients participated in the development and testing process of Axia. 4 Trial design and procedures were devised without any involvement of patients or DVMB representatives. The results of this study will be shared with the patient community in patient-friendly language via the DVMB.
Participants
Eligibility was assessed during a screening appointment. For eligibility in the Bechterew-App Trial II, candidate participants were required to be at least 18 years old, not pregnant and to have a confirmed diagnosis of axSpA. A Bath ankylosing spondylitis disease activity index (BASDAI) of at least 3.0, sufficient digital literacy skills defined by the regular use of a messenger service, and stable axSpA-specific pharmacotherapy were further mandatory eligibility criteria.
The main exclusion criterion was a pre-existing engagement in high levels of physical exercise. The Supplemental File provides a comprehensive list of all inclusion and exclusion criteria.
Recruitment
All participants were recruited at the rheumatology outpatient department of the University Hospital of Wuerzburg. Potentially eligible patients were contacted by telephone and invited to participate by their treating rheumatologists. All patients were then screened and informed about the trial directly by the principal investigator.
Intervention
After phase I with SOC and without any study intervention, participants received unlimited access to the medical app Axia for 12 weeks in addition to SOC in phase II (intervention phase; Figure 1). The frequency and extent of app use were ad libitum, based on participants’ personal preferences, and were not influenced by the study personnel. This approach was intended to reflect real-world conditions.
The digital health application (DHA, German ‘DiGA’ (Digitale Gesundheitsanwendung)) Axia is a German-language, axSpA-specific and patient-tailored medical app. Axia is certified as a Conformité Européenne (CE) class I medical device under the European Medical Device Regulation (MDR) and has been approved as an official DHA for axSpA in Germany since the end of January 2026. The content and functions of Axia have been described in detail elsewhere.4,5
Primary and secondary endpoints and their assessment
The prespecified primary endpoint was improvement in disease-specific spinal mobility, measured by the Bath Ankylosing Spondylitis Metrology Index (BASMI, range 0–10), at week 24 compared with week 12 and, secondarily, with baseline. The BASMI was assessed according to the standard method by two medical students who had been trained by an experienced consultant in rheumatology. 6
Secondary endpoints included patient-reported disease activity (BASDAI, range 0–10) and body composition measured by bioelectrical impedance analysis (BIA): body weight (kg), percentage of lean mass and percentage of body fat. 7 BIA was performed with the SECA mBCA 515. 8 Strength testing comprised two components. First, the isometric maximum strength of the trunk flexor and spinal extensor muscles was measured in newton metres (Nm) using the dynamometric chair device Easytorque (provided by the Integrative and Experimental Exercise Science and Training, Institute for Sports Science of the University of Wuerzburg 9 ); three measurements were performed and their mean was used for further analyses. Second, perceived functional strength was assessed using numeric rating scale (NRS) scores for perceived exertion (0–10) during four exercises (bird–dog crunches 60 s, glute bridge 55 s, forearm plank and biceps curl, both 45 s). Except for biceps curls, the exercises were selected because they are part of the exercise programmes of Axia. Biceps curls were added as a negative control because Axia does not focus on increasing arm strength and biceps curls are therefore not part of its exercise programmes. If an exercise could not be performed or was stopped before the end of the performance time, an NRS score of 10 was recorded for this exercise. At each visit, a native MRI scan of the SIJ was performed, including a coronal oblique T1-weighted sequence and a coronal oblique Turbo Inversion Recovery Magnitude (TIRM) sequence as a fat-suppressed T2-weighted sequence.
The Spondyloarthritis Research Consortium of Canada (SPARCC) scores for sacroiliac inflammation (SIS), ranging from 0 to 72, and structural damage (SSS), with a range of 0–20 for ankylosis and backfill and 0–40 for fat metaplasia and erosion, were calculated by a senior consultant in radiology, specialised in musculoskeletal (MSK) radiology, according to the standardised methods.10,11 The radiologist was blinded to the clinical outcomes. All these secondary outcomes were assessed at baseline, week 12 and week 24.
Changes in BASMI, BASDAI and isometric strength in both phases were assessed as additional secondary endpoints. The usage frequency of Axia was objectively tracked for each participant during phase II, and AEs were assessed at each visit using a questionnaire and reassessed by the study personnel.
Per protocol, the participants first underwent assessment of BIA, followed by the BASMI and strength measurements to avoid fatigue or pain affecting the BASMI results. Subsequently, the questionnaires (BASDAI, AEs) were completed. The MRI assessments were performed at different time points depending on prescheduled appointments in the radiology department.
Sample size calculation
The sample size calculation was performed by Dr Victoria Ruecker, an independent statistician of the Institute for Epidemiology of the University Hospital of Wuerzburg. It was based on a one-factor repeated-measures analysis of variance (ANOVA) with a contrast between measurement 1 and measurement 2. The expected effect was derived from the BASMI improvements reported by Yigit et al. 12 in a trial with a similar intervention, and a common mean correlation of 0.5 between measurements was assumed. The significance level was set at 5%. To detect a difference in BASMI between measurement 1 and measurement 2 across the three time points with 80% power, a sample size of 24 participants was required. Assuming a dropout rate of 20%, a total of 30 participants needed to be recruited.
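The dropout adjustment described above follows the usual inflation rule, recruiting the smallest whole number n such that n × (1 − dropout rate) is at least the required sample size. The Python sketch below reproduces only this arithmetic step. The function name is illustrative, and the required sample size of 24 is taken as given from the power calculation rather than recomputed.

```python
def inflate_for_dropout(n_required: int, dropout_percent: int) -> int:
    """Smallest number to recruit so that n_required remain after dropout.

    Integer ceiling division is used to avoid floating-point rounding.
    """
    if not 0 <= dropout_percent < 100:
        raise ValueError("dropout_percent must be in [0, 100)")
    retained_percent = 100 - dropout_percent
    # Ceiling of n_required * 100 / retained_percent
    return -(-n_required * 100 // retained_percent)


n_recruit = inflate_for_dropout(24, 20)
print(n_recruit)  # 30
```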
Statistical analysis
The primary endpoint was analysed using a one-way repeated-measures ANOVA with time (baseline, week 12 and week 24) as the within-subject factor. Normality of residuals was assessed before analysis. The primary comparison of interest, defined a priori, was the difference between week 12 and week 24. Bonferroni-adjusted post hoc tests were used for pairwise comparisons. Statistical significance was set at p < 0.05. Secondary endpoints were analysed using the same approach, where appropriate. If the normality assumption was not met, a Friedman test followed by Dunn’s post hoc multiple-comparison test was performed as a non-parametric alternative.
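To illustrate the non-parametric alternative named above, the following Python sketch computes a tie-corrected Friedman statistic for repeated measurements at three time points. It is a minimal illustration and not the GraphPad Prism routine used in the study. The function names and the five rows of scores are hypothetical. The p-value uses the chi-square approximation, which for three time points (2 degrees of freedom) has the closed form exp(−x/2).

```python
import math


def average_ranks(values):
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = mean_rank
        pos = end + 1
    return ranks


def friedman_test(rows):
    """Tie-corrected Friedman statistic for n subjects (rows) x k time points.

    Assumes at least one subject has non-identical values (non-zero denominator).
    """
    n, k = len(rows), len(rows[0])
    ranked = [average_ranks(row) for row in rows]
    rank_sums = [sum(r[j] for r in ranked) for j in range(k)]
    sum_sq_ranks = sum(x * x for r in ranked for x in r)
    correction = n * k * (k + 1) ** 2 / 4
    expected = n * (k + 1) / 2
    numerator = (k - 1) * sum((rs - expected) ** 2 for rs in rank_sums)
    stat = numerator / (sum_sq_ranks - correction)
    # Closed-form chi-square upper tail is only available here for 2 df (k = 3).
    p_value = math.exp(-stat / 2) if k == 3 else None
    return stat, p_value


# Hypothetical scores (baseline, week 12, week 24) for five participants.
scores = [
    [3.2, 3.1, 2.4],
    [2.0, 2.2, 1.6],
    [4.5, 4.4, 4.0],
    [1.8, 1.8, 1.2],
    [5.0, 4.8, 4.1],
]
stat, p_value = friedman_test(scores)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

A significant omnibus result would then be followed by pairwise post hoc comparisons, as described for Dunn’s test above.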
Changes within each study phase (BASMI, BASDAI and isometric strength) were analysed using two-sided paired Student’s t-tests when normality assumptions were met and Wilcoxon signed-rank tests otherwise. Correlations were assessed using Spearman’s rank correlation coefficient (ρ). For subgroup analyses, the Mann–Whitney U test was applied to compare group differences in medians when the normality assumption for the residuals was not met.
Because of the exploratory nature of the secondary endpoints, no adjustment for multiple testing was performed. Results are presented as mean with standard deviation (SD) or median with interquartile range (IQR), as appropriate. All analyses were performed using GraphPad Prism V.5.0 (GraphPad Software, Inc., La Jolla, CA, USA).
Results
Recruitment and baseline characteristics
Recruitment took place at the University Hospital of Wuerzburg between February 2024 and December 2024. Medical records from the outpatient department were screened to identify patients with a diagnosis of axSpA and a BASDAI ⩾3.0. Sixty patients were screened for eligibility, of whom 32 met the inclusion criteria and were enrolled. Five participants withdrew during the study, including two investigator-initiated withdrawals and three participant-initiated withdrawals. One participant withdrew informed consent before the baseline assessment. Four participants requested deletion of their data after withdrawal in accordance with German law. Overall, 27 participants had completed the study by July 2025 and were included in the analysis. Figure 2 shows the study flow diagram. Baseline characteristics are presented in Table 1.

Figure 2. Flowchart of the study.
Table 1. Baseline characteristics of the study population.
BASDAI, Bath ankylosing spondylitis disease activity index; BASMI, Bath ankylosing spondylitis metrology index; b/tsDMARD, biological or targeted synthetic disease-modifying anti-rheumatic drugs; MRI, magnetic resonance imaging; r-axSpA, radiographic axial spondyloarthritis; SD, standard deviation; SIS, sacroiliac inflammation score; SPARCC, Spondyloarthritis Research Consortium of Canada.
Primary outcomes
A one-way repeated-measures ANOVA showed a significant effect of time on BASMI (F (2,52) = 15.16, p < 0.001, η² = 0.37). Bonferroni-adjusted post hoc comparisons showed no significant difference between baseline and week 12, whereas BASMI was significantly lower at week 24 than at baseline and week 12 (baseline: 3.1 (SD 1.6); week 12: 3.0 (SD 1.6); week 24: 2.4 (SD 1.6); week 24 vs baseline: p < 0.001; week 24 vs week 12: p < 0.001; baseline vs week 12: p > 0.05; Figure 3(a)). Because the normality assumption was not fully met, a Friedman test was additionally performed as a sensitivity analysis and showed the same pattern of results (χ² (2) = 24.25, p < 0.001). Dunn’s post hoc comparisons revealed no difference between baseline and week 12, but significant differences between baseline and week 24 and between week 12 and week 24. These analyses consistently support a significant improvement in BASMI at week 24, indicating that the primary endpoint was met.
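The reported test statistics can be cross-checked with a few lines of Python. This sketch assumes that the reported η² is the partial eta squared of the one-way repeated-measures design and that the degrees of freedom are uncorrected for sphericity; neither is stated explicitly in the text. It recovers the error degrees of freedom from 27 completers and three time points, the effect size from the F statistic, and the Friedman p-value from the closed-form chi-square tail for 2 degrees of freedom.

```python
import math


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


n_participants, n_timepoints = 27, 3
df_effect = n_timepoints - 1                          # 2
df_error = (n_timepoints - 1) * (n_participants - 1)  # 52

eta_sq = partial_eta_squared(15.16, df_effect, df_error)
print(round(eta_sq, 2))  # 0.37

# Chi-square with 2 degrees of freedom has the upper tail exp(-x / 2).
friedman_p = math.exp(-24.25 / 2)
print(friedman_p < 0.001)  # True
```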

Figure 3. Changes in spinal mobility (BASMI) and disease activity (BASDAI). One-way repeated-measures ANOVA showed significant improvement in mean BASMI and mean BASDAI at week 24 compared to week 12 and baseline (a, c). Results are presented as means with SD as error bars. Analysis of the median change in BASMI (b) and mean change in BASDAI (d) reveals significantly greater improvements in phase II (Axia intervention), which exceeded the MCID threshold (⩾1) in the case of BASDAI. (e) Median change in BASMI in phase II was greater in participants with inflammation on MRI at baseline, while median changes in BASDAI were independent of baseline MRI inflammation (f). Panels (b), (e) and (f) show boxplots with the box representing the interquartile range and the central line the median. Whiskers show the 10th and 90th percentiles, with outliers presented as points. Panel (d) shows mean values with SEM as error bars.
Secondary outcomes
As a secondary endpoint, mean BASDAI was significantly lower at week 24 than at baseline and week 12 (baseline: 4.7 (SD 1.7); week 12: 4.9 (SD 2.1); week 24: 3.7 (SD 1.9); week 24 vs baseline: p < 0.001; week 24 vs week 12: p < 0.001; baseline vs week 12: p > 0.05; Figure 3(c)). Repeated-measures ANOVA showed a significant effect of time on BASDAI (F (2,52) = 12.80, p < 0.001, η² = 0.33), corresponding to a large effect size.
Changes were greater during phase II (Axia intervention) than during phase I (SOC) for both BASMI and BASDAI. For BASMI, the median change was −0.2 (IQR −0.4 to 0.2) during phase I and −0.6 (IQR −0.8 to 0) during phase II (p = 0.026), corresponding to a moderate effect size (r = 0.43). For BASDAI, the mean change was 0.2 (SD 1.2) during phase I and −1.1 (SD 1.4) during phase II, with a mean paired difference of 1.3 points (95% CI 0.36–2.20; p = 0.008), corresponding to a moderate effect size (Cohen’s dz = 0.55). The reduction in BASDAI during phase II exceeded the minimal clinically important difference (MCID) threshold of ⩾1.0. 7
Median BASMI at baseline was numerically higher in patients without MRI-detectable inflammation than in those with inflammation (with inflammation (n = 10): 2.0 (IQR 1.7–4.2) vs without inflammation (n = 17): 3.2 (IQR 2.1–4.9)). By contrast, median BASDAI at baseline was higher in patients with active inflammation on MRI than in those without inflammation (with inflammation: 5.2 (IQR 3.5–6.8) vs without inflammation: 3.9 (IQR 3.6–5.1); Table S1). Nevertheless, median improvement in BASMI in phase II was significantly greater in participants with MRI-detected inflammation of the SIJ than in those without inflammation (median change in BASMI with inflammation: −0.80 (IQR −1.05 to −0.40) vs without inflammation: −0.20 (IQR −0.70 to 0.00); p = 0.03; Figure 3(e)). Median changes in BASDAI in phase II did not differ significantly between participants with and without MRI-detected inflammation (Figure 3(f)). No relevant changes were observed for SPARCC SIS or SPARCC SSS over time (Figures S1 and S2).
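The effect size r = 0.43 reported above for the phase comparison of BASMI changes is consistent with the common convention r = Z/√N. The Python sketch below is a plausibility check only. It assumes that r was derived from the normal approximation of the Wilcoxon signed-rank test and that N is the 27 complete pairs; the source does not state either. Z is recovered from the reported two-sided p-value.

```python
import math
from statistics import NormalDist


def r_from_two_sided_p(p_value: float, n_pairs: int) -> float:
    """Effect size r = Z / sqrt(N), with Z recovered from a two-sided p-value."""
    z = NormalDist().inv_cdf(1 - p_value / 2)
    return z / math.sqrt(n_pairs)


r = r_from_two_sided_p(0.026, 27)
print(round(r, 2))  # 0.43
```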
The median absolute isometric strength of the spinal extensor muscles was significantly higher at week 24 than at week 12 (baseline: 217.4 Nm (IQR 166.8–291.1 Nm); week 12: 206.2 Nm (IQR 155.7–246.6 Nm); week 24: 241.6 Nm (IQR 183.0–324.5 Nm); p < 0.05 for week 24 vs week 12; Figure 4(a)). This corresponded to a median increase of 25.0% (IQR −4.1% to 53.7%) during phase II compared with a median decrease of 13.7% (IQR −22.9% to 2.6%) during phase I (p = 0.004; Figure 4(b)). No significant changes were observed for trunk flexor strength (Figure 4(c) and (d)).
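The percentages reported above are medians of each participant’s individual relative change (hence the accompanying IQRs), so they need not match the relative change between the group medians. The Python sketch below first computes the change between the reported medians and then uses three hypothetical participants, invented purely for illustration, to show how the two summaries can diverge.

```python
from statistics import median


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Change between the reported group medians of spinal extensor strength (Nm).
print(round(percent_change(217.4, 206.2), 1))  # -5.2 (baseline to week 12)
print(round(percent_change(206.2, 241.6), 1))  # 17.2 (week 12 to week 24)

# Hypothetical paired values (Nm), for illustration only.
week12 = [100.0, 200.0, 300.0]
week24 = [125.0, 210.0, 400.0]

individual = [percent_change(b, a) for b, a in zip(week12, week24)]
median_of_changes = median(individual)
change_of_medians = percent_change(median(week12), median(week24))

print(round(median_of_changes, 1))  # 25.0
print(round(change_of_medians, 1))  # 5.0
```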

Figure 4. Changes in isometric and perceived functional strength. A Friedman test with Dunn’s multiple comparisons post hoc analysis demonstrated a significant improvement in isometric strength of the spinal extensor muscles (Nm) at week 24 compared with week 12 (a). The median relative change in strength was an increase of 25% during phase II, whereas isometric strength decreased during phase I (b). No significant changes were observed in the isometric strength of the trunk flexor muscles (c, d). Perceived functional strength, assessed using an NRS for exertion, improved significantly for bird–dog crunches, glute bridges and forearm planks, whereas no changes were observed for biceps curls, which served as a negative control (e–h). Results are presented as bar plots showing means with SDs as error bars when residuals were normally distributed. In cases where normality assumptions were not met, data are shown as box plots displaying the IQR with the median indicated by the central line. Whiskers represent the 10th and 90th percentiles, and outliers are shown as individual data points.
For perceived functional strength, NRS ratings for exertion during the bird–dog crunch, glute bridge and forearm plank were significantly lower at week 24 than at baseline and week 12, whereas no improvement was observed for biceps curls (Figure 4(e)–(h)).
No relevant changes were observed in body weight, percentage of lean mass and percentage of body fat (Figure S3).
Changes in BASMI were inversely correlated with changes in isometric maximum strength of the spinal extensor muscles (ρ = −0.28, p < 0.05), whereas changes in BASMI were not significantly correlated with changes in BASDAI. NRS ratings for the bird–dog crunch, glute bridge and forearm plank were moderately correlated with each other (bird–dog crunch vs glute bridge: ρ = 0.49, p < 0.001; bird–dog crunch vs forearm plank: ρ = 0.44, p < 0.01; glute bridge vs forearm plank: ρ = 0.30, p < 0.05; Table S2). Improvement in spinal extensor muscle strength was moderately inversely correlated with NRS ratings for the glute bridge (ρ = −0.30, p < 0.05) and forearm plank (ρ = −0.43, p < 0.01; Table S2).
Adherence rates
App use was objectively tracked during the 12-week intervention period. On average, participants used the app on 66 days (SD 22), corresponding to 78% (SD 25) of the available intervention days. The mean total usage time was 1192 min (SD 956). Overall, 63% of participants used Axia on more than 75% of the intervention days, and 22% used the app daily. Total days of use, accumulated in-app points, total usage time (minutes) and the utilisation rate (percentage of actual vs possible usage days) were strongly intercorrelated. No correlations were found between usage metrics and improvements in BASMI or BASDAI (Table S3).
Safety
A total of 17 AEs were observed in the trial, with a numerically higher number in phase II (n = 13) than in phase I (n = 4). By contrast, severe AEs (SAEs) were numerically more frequent in phase I (n = 2) than in phase II (n = 1). No device-related AEs were identified in phase II. All AEs are listed in Table 2.
Table 2. Adverse events.
Sum scores are shown in bold. AE, adverse event; MSK, musculoskeletal; n/a, not applicable; SAE, severe AE.
Discussion
In this exploratory interventional study, use of the disease-specific medical app Axia was associated with improved spinal mobility, spinal muscle strength and patient-reported disease activity in patients with axSpA receiving stable pharmacotherapy. The Bechterew-App Trial II adds objective data on disease-specific and biomechanical outcomes, thereby complementing the results of the Bechterew-App Trial I, which primarily assessed PROMs. 4 Axia represents a promising new approach for the comprehensive management of patients with axSpA. 4
Axia was used consistently throughout the study period. The frequency of use was comparable to that reported in the Bechterew-App Trial I. 4 Axia delivers disease-specific home-based exercise with a particular focus on the spine.4,5 The exercise programme dynamically adapts to patients’ individual needs and aims to progressively enhance both training intensity and volume.4,5 In line with this approach, a median 25% increase in isometric maximum strength of the spinal extensor muscles was observed, objectively measured using a dynamometer (Easytorque). In addition, perceived exertion during trunk exercises that are regularly incorporated into the app-based training plans (bird–dog crunches, forearm planks and glute bridges) decreased over the course of the intervention, further supporting positive training effects. These findings are clinically relevant because previous studies have shown that strength is even more negatively affected than mobility in axSpA patients when compared to healthy controls. 13 Importantly, these improvements in strength were accompanied by significant improvements in disease-specific mobility: changes in spinal extensor strength were inversely correlated with changes in the BASMI, meaning that greater strength gains went along with lower BASMI scores. The BASMI is an established endpoint in axSpA trials and was the primary endpoint of this study. 6
That exercise therapy can improve disease-specific mobility is well established, as demonstrated by several meta-analyses.14 –16 However, a major limitation of the current evidence base is the heterogeneity of interventions and study designs, which restricts direct comparability across trials.14 –16 A recent meta-analysis by Zhang et al. 14 reported an estimated mean BASMI reduction of −0.49 across all included studies. Notably, this analysis pooled a wide range of exercise modalities. In a comparable study setting with home-based exercise involving patients receiving TNF inhibitors, Yigit et al. 12 observed a mean BASMI reduction of −0.9. The somewhat greater reduction compared to our trial may be explained by the higher baseline BASMI score in their cohort (5.05), suggesting greater room for improvement than in our study population. 12 Even interventions with supervised high-intensity training did not achieve outcomes as favourable as those of the cohort in this Turkish trial.12,17 In an RCT (n = 59) evaluating a DHA for nonspecific back pain used for home-based exercise in a comparable study setting, BASMI scores remained stable in the intervention group (n = 30), whereas the control group (n = 29) exhibited a worsening of BASMI over time. 18 Unlike for other Bath indices, an MCID for the BASMI has not been defined. 7 Even in RCTs of bDMARDs, typical BASMI improvements generally range from about −0.3 to −1.0. 19 Against this background, the observed median change of −0.6 in our study can be considered clinically meaningful.
As an important secondary endpoint, patient-reported disease activity was assessed using the BASDAI. Consistent with the findings of the Bechterew-App Trial I, participants demonstrated a significant improvement in BASDAI scores, exceeding the established MCID thresholds (BASDAI improvement ⩾1).4,7 This underscores the reproducibility of the app’s effects that were observed in the RCT. The mean improvement was slightly smaller than that observed in Bechterew-App Trial I, which may be explained by the lower mandatory minimum BASDAI score required for inclusion in Trial II (3.0 vs 3.5 in Trial I), resulting in lower baseline BASDAI values in the present study. 4 The correlation between BASMI and BASDAI is often only modest. 20 The absence of a significant association in our study may be attributable to the limited sample size.
A question remaining from the Bechterew-App Trial I is whether the observed improvement in BASDAI following the app-induced increase in physical activity is accompanied by a direct reduction in inflammatory activity. 4 Some studies have reported decreases in inflammatory biomarkers (e.g., IL-6 and CRP) in patients with axSpA following structured exercise interventions.17,21 Because our study was conducted with a medical device that was not yet approved at the time of the trial, the applicable regulatory framework did not permit blood sampling for inflammatory biomarkers. We could therefore only assess MRIs of the SIJ, which were primarily performed as a safety endpoint. Interestingly, participants with SIJ inflammation on MRI showed significantly greater improvements in BASMI than those without inflammation. This finding is somewhat surprising, given that the group with MRI-detected inflammation had lower baseline BASMI values, indicating better mobility despite active inflammation of the SIJ. By contrast, participants with active inflammatory changes on MRI had higher baseline BASDAI scores, as expected.22,23 However, improvement in BASDAI during phase II did not differ significantly between the two subgroups, precluding any conclusion as to whether patients with active inflammation show a better response in patient-reported disease activity. In contrast to BASDAI, the association between BASMI and SIJ inflammation is not well established.22 –24 No significant changes in SPARCC SIS were observed. To our knowledge, this is the first study to investigate the effect of exercise therapy on MRI-detected inflammatory changes in patients with axSpA; consequently, no reference data are available for comparison and interpretation. Furthermore, the association between MRI-detected inflammatory activity and circulating inflammatory biomarkers in axSpA appears to be less pronounced than in other inflammatory rheumatic and musculoskeletal diseases (iRMDs). 25
The absence of changes in the SPARCC SIS, despite improvement in BASDAI, may be explained by the fact that the presence of MRI-confirmed SIJ inflammation was not an inclusion criterion in this trial, and the study was therefore not powered for this endpoint. Accordingly, the mean baseline SPARCC SIS was low (1.8), and only a subset of 10 patients demonstrated active inflammatory lesions of the SIJ at study entry. For logistical and economic reasons, MRI of the spine was not performed in our trial; therefore, no conclusions can be drawn regarding potential inflammatory changes in the spine.
Conversely, the absence of increased inflammatory or structural MRI changes may be interpreted as reassuring from a safety perspective, which was the primary purpose of the MRI assessments in this study. Despite intensified exercise of previously affected joints, no increase in bone marrow oedema or structural damage was detected. However, the 12-week observation period may have been too short to detect meaningful structural or imaging changes in terms of treatment response.26,27 Consistent with the MRI findings, no device-related AEs were observed, which is in line with the results of the Bechterew-App Trial I. 4 Thus, the present study further supports the safe use of Axia in axSpA patients.
No effects on body composition or body weight were observed. This may be explained by the fact that dietary factors generally exert a greater influence on weight change than increases in physical activity alone. 28 As Axia did not include a nutritional intervention, a substantial impact on body weight or body composition was not expected.
However, the results of this trial should be interpreted with caution due to several limitations. First, the study was unblinded for both participants and investigators. This is an inherent challenge in trials evaluating medical devices, in contrast to pharmacological studies. Blinding is difficult to achieve when an intervention involves active device use, particularly in the case of CE-marked medical devices, because participants can easily determine whether they are receiving the verum intervention (medical app) or a placebo.
Second, we did not employ a classical RCT design with a parallel control group. Instead, we used a pre–post interventional design in which participants served as their own controls during the pre-intervention phase. This approach was chosen because the larger nationwide RCT for DHA (DiGA) approval was conducted in parallel, and we aimed to avoid competitive recruitment effects. Moreover, given the impossibility of blinding, we were concerned that participants randomised to the control group might be less willing to complete the full study protocol – including potentially stressful MRI and strength testing assessments – without receiving any potential benefit from an intervention/app use.
Third, the study was a single-centre study. All in-hospital visits were conducted at a single centre, which may limit the generalisability of the findings to the broader German and European population.
Additional limitations should also be considered: The sample size was modest and primarily powered for the BASMI endpoint; therefore, analyses of secondary, imaging and correlation outcomes should be regarded as exploratory. Recruitment at a single tertiary centre and the requirement for German language proficiency, smartphone access and digital literacy may have introduced selection bias and may limit generalisability. In addition, five participants withdrew, and data from four participants had to be deleted after withdrawal in accordance with German law, precluding a more comprehensive missing-data analysis. Finally, app use was ad libitum, and repeated functional assessments may have been influenced by learning effects, which cannot be fully disentangled from intervention-related changes in the absence of a parallel control group.
The present trial also has important strengths compared with the Bechterew-App Trial I, which was conducted entirely remotely and relied exclusively on PROMs. 4 In our study, all participants were assessed in person during in-hospital visits, and comprehensive medical histories were available. Since most participants were treated at the University Hospital of Wuerzburg, pharmacological and physiotherapeutic treatments could be monitored more closely during the study period than in the RCT. In addition to PROMs, objective outcome parameters were available, complementing the findings of the Bechterew-App Trial I. Furthermore, baseline characteristics were more balanced in this cohort than in the Bechterew-App Trial I, particularly with regard to the male-to-female ratio. 4 Another important difference concerns recruitment: most participants in the present study were actively invited and did not self-refer for participation. By contrast, recruitment for the Bechterew-App Trial I included outreach via social media channels of a patient self-help organisation, which may have resulted in the inclusion of particularly motivated individuals. 4
This trial provides the first important insights into positive biomechanical effects associated with a 12-week intervention using the medical app Axia. These findings meaningfully complement the previously reported improvements in patient-reported disease activity, functional status and quality of life observed in the Bechterew-App Trial I. 4 Nevertheless, these results require confirmation in larger multicentre studies with controlled study designs. In particular, the question of whether the observed improvements in BASDAI directly reflect a reduction in inflammatory activity could not be conclusively addressed, as the study was neither specifically designed nor statistically powered to answer it. A major limitation was the inability to collect blood samples due to the regulatory classification of the study under the MDR. Following the approval of Axia as a fully reimbursable DHA (DiGA) in Germany, the collection of laboratory parameters and the performance of more comprehensive MRI assessments during routine clinical use of the app have become feasible. This may facilitate further studies addressing the remaining open questions.
Conclusion
In conclusion, this trial provides supporting evidence that use of the medical app Axia is associated with improvements in disease-related biomechanical impairments, and not merely with enhanced self-efficacy or other psychological effects.
Supplemental Material
Supplemental material, sj-docx-1-tab-10.1177_1759720X261453287 for Association of an app-based intervention with improvements in mobility, trunk muscle strength and patient-reported disease activity in axial spondyloarthritis: a 24-week pre–post study by Patrick-Pascal Strunz, Marc Schmalzing, Amelie Wüst, Patricia Possler, Tobias Heusinger, Maxime le Maire, Anna Fleischer, Thorsten Bley, Michael Gernert, Hannah Labinsky, Ottar Gadeholt, Robert Leppich, Astrid Schmieder, Ludwig Hammel, Billy Sperlich, Ann-Cathrin Koschker, Hermann Einsele, Matthias Froehlich and Karsten Sebastian Luetkens in Therapeutic Advances in Musculoskeletal Disease
Supplemental material, sj-docx-2-tab-10.1177_1759720X261453287 for Association of an app-based intervention with improvements in mobility, trunk muscle strength and patient-reported disease activity in axial spondyloarthritis: a 24-week pre–post study by Patrick-Pascal Strunz, Marc Schmalzing, Amelie Wüst, Patricia Possler, Tobias Heusinger, Maxime le Maire, Anna Fleischer, Thorsten Bley, Michael Gernert, Hannah Labinsky, Ottar Gadeholt, Robert Leppich, Astrid Schmieder, Ludwig Hammel, Billy Sperlich, Ann-Cathrin Koschker, Hermann Einsele, Matthias Froehlich and Karsten Sebastian Luetkens in Therapeutic Advances in Musculoskeletal Disease
Footnotes
Appendix
Acknowledgements
Data from this work were presented as a poster and an oral presentation at the German Rheumatology Congress 2025 in Wiesbaden and at the EULAR congress 2026 in London. We would like to thank all the patients who participated in the trial. Furthermore, we would like to thank the German patient self-help association (DVMB) for supporting us. We are also grateful to our statistician, Dr Victoria Rücker, for performing the sample size calculation. Applimeda is the developer and rights holder of Axia. M.l.M., T.H. and R.L. are shareholders, partners or employees of Applimeda. None of the other authors is financially dependent on Applimeda.
Declarations
Supplemental material
Supplemental material for this article is available online.
References
