AI-Enhanced Imaging for Diabetic Foot Ulcer Risk Assessment and Diagnosis: A Retrospective Cohort Study

Abstract

Background:

The automated assessment and prediction of diabetic foot ulcer (DFU) severity depends heavily on precise segmentation of the ulcer region. Built-in segmentation tools often lack the accuracy needed to delineate wound boundaries effectively, so this study avoided relying on them. The objective was to develop and evaluate an artificial intelligence (AI)-driven method for ulcer segmentation and severity classification of DFUs using Wagner’s grading system.

Methods:

A novel method was introduced for segmenting the boundaries of DFUs, paired with a lightweight classification model for predicting ulcer severity as per Wagner’s grade. The segmentation method incorporated an enhanced active contour model, combined with Sobel edge detection, to achieve precise delineation of ulcer edges. The method was developed using a retrospective cohort of patients in India: a total of 1339 ulcer images were collected from 510 patients and augmented to 6579 images to improve AI-model generalizability. An AI-powered mobile application was developed to facilitate the real-time and remote assessment of the severity of DFUs.

Results:

The proposed segmentation approach successfully delineated ulcer regions, achieving a Dice similarity coefficient of 0.99. The classification model attained an accuracy of 95.58%, with a sensitivity of 95.58%, a specificity of 99.16%, and an F1 score of 95.53%. The method also recorded a false-positive rate of 0.84% and a false negative rate of 4.83%, reflecting improved classification performance compared to existing methods.

Conclusions:

The comparative analysis demonstrated that the proposed method significantly improved both segmentation and classification of DFUs, thereby supporting enhanced clinical management of the condition.

Keywords

active contour model; diabetic foot ulcer; MobileNetV3-Small; segmentation; Sobel operator; Wagner grading system

Introduction

According to the International Diabetes Federation, approximately 463 million adults worldwide, equivalent to one in every eleven individuals, have diabetes. One of the most severe complications associated with diabetes is diabetic foot ulcer (DFU), which significantly impacts patients’ quality of life and increases the risk of lower limb amputation.¹ DFUs often arise from poorly managed diabetes, leading to neuropathy, which can result in skin damage primarily on the plantar surface of the foot and beneath the hallux, potentially exposing underlying tissues and increasing the risk of bone damage. The lifetime risk of developing a DFU among people with diabetes ranges from 19% to 34%.² The pathophysiology of DFUs involves a combination of neuropathy, trauma, and, frequently, peripheral arterial disease. Recent advancements in computer vision algorithms have enhanced medical imaging applications for detecting lesions, such as DFUs.³ Automated detection, segmentation, and classification of DFUs remain critical areas of research. Wagner’s grading system is commonly employed to categorize DFUs by severity, ranging from intact skin but foot at risk (Grade 0) to extensive gangrene affecting the whole foot (Grade 5).⁴,⁵ Accurate diagnosis of DFUs requires a detailed medical history, thorough foot examination, and supporting diagnostic tests, including imaging. This study aimed to improve the automated segmentation of ulcer boundaries and the classification of ulcer severity grades using image processing and deep-learning techniques, aligning with Wagner’s grading system.⁶ In addition, the classification model was deployed in a smartphone environment for early screening and prompt treatment actions and plans.

This article is organized as follows: section “Introduction” introduces the study; section “Literature Survey” provides a comprehensive survey of prior studies; section “Data Set Description” describes the data set; section “Materials” outlines the materials, including data preprocessing, augmentation, and image rescaling; section “Methods” explains the methods employed, including wound region estimation, segmentation techniques, and the severity classification model; section “Results and Analysis” presents the experimental results along with detailed analysis; section “Discussion” provides a comparative discussion of the present study with prior works; and section “Conclusion” concludes this article and proposes directions for future research.

Literature Survey

The literature review has been structured into three subsections for clarity: (1) segmentation-only methods, (2) classification-only methods, and (3) hybrid pipelines.

Segmentation-Only Methods

Several studies have addressed DFU segmentation using classical and deep-learning–based approaches.

Niri et al⁷ introduced a superpixel-based convolutional neural network (CNN) architecture using U-Net, achieving 92.68% accuracy and a Dice score of 75.74% on a data set of 219 images by employing Simple Linear Iterative Clustering (SLIC) superpixels and morphological operations. Heras-Tang et al⁸ developed a hybrid segmentation algorithm combining logistic regression, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and morphological operations, which resulted in 94% accuracy, 86% sensitivity, and 91% precision, but was applied to approximately 70 images without external validation. Wang et al⁹ proposed Improved Self-Adaptive Gradient Vector Flow (ISAGVF) as an enhancement to the Gradient Vector Flow (GVF) Snake model, improving edge preservation and deep concavity convergence. Chen et al¹⁰ introduced an Active Contour Model based on Jeffreys divergence and K-medoids clustering, achieving segmentation efficiencies of 94.5% (Fuzzy C-means [FCM]), 91.5% (K-means), and 95.7% (K-medoids), although applicability was limited to double-phase images. Peng et al¹¹ developed a 2D Otsu multi-threshold segmentation algorithm using an optimized genetic algorithm that improved peak signal-to-noise ratio (PSNR) by 13.3768 and reduced segmentation time. Canales et al¹² presented GA11, an automated algorithm that selected optimal segmentation parameters using a genetic algorithm, achieving a Generalized Cross Entropy (GCE) of 0.0240, outperforming Principal Component Analysis (PCA; 0.0185). EU-Net¹³ combined a U-Net architecture with a differential evolutionary algorithm, improving segmentation accuracy for liver (95.9%), left kidney (90.0%), right kidney (86.8%), and spleen (87.6%). Wang et al¹⁴ created a MobileNetV2-based framework for segmenting wound regions, achieving a Dice score of 90.4% on an augmented data set of 5000 images generated from 1109 originals, but did not include severity assessment.

Classification-Only Methods

Alzubaidi et al¹⁵ applied 18-fold augmentation to a 754-image data set for binary classification (normal vs DFU), achieving 95.8% accuracy, but without segmentation or severity grading. Al-Garaawi et al¹⁶ developed a CNN-based classification method using Mapped Binary Patterns from RGB images, reporting an area under the curve (AUC) of 98.1% and F1-score of 95.2%. Toofanee et al¹⁷ emphasized rapid DFU intervention and proposed a Siamese Neural Network combining CNN and Vision Transformers, achieving 95% accuracy for four-category ulcer classification.

Hybrid Segmentation-And-Classification Pipelines

In Motta et al,¹⁸ K-means clustering followed by an artificial neural network yielded an error rate of less than 8%, integrating segmentation with downstream classification. Rajathi et al¹⁹ designed the DUTC-Net for DFU classification, attaining 97.9% sensitivity and 97.5% accuracy in stage classification.

Limitations of Prior Studies

Although prior research has advanced the analysis of DFUs, these studies remain constrained by several limitations:

Small data sets and lack of augmentation: Niri et al⁷ used only 219 images without any data augmentation, limiting the generalizability of their U-Net-based approach. Heras-Tang et al⁸ applied their hybrid logistic regression and clustering method to ~70 images but did not perform external validation, restricting clinical applicability.

Segmentation without severity assessment or classification: Wang et al¹⁴ augmented a data set of 1109 images to 5000 for wound segmentation using MobileNetV2, but their framework excluded classification or severity assessment.

Insufficient external validation: Rajathi et al¹⁹ performed both tissue and stage classification on 1500 varicose ulcer images and validated results using leave-one-out cross-validation, yet no external testing was conducted.

Fragmented workflows (classification-only, without segmentation or severity grading): Alzubaidi et al¹⁵ applied 18-fold augmentation to a 754-image data set for binary classification (normal vs DFU), but their method did not incorporate segmentation or severity grading.

Few studies integrate segmentation and ulcer grading within a unified framework or validate models under varied image acquisition conditions. As a result, most approaches remain fragmented and less applicable to real-world clinical workflows.

Data Set Description

This retrospective randomized observational study was conducted among resident Indian patients with type 2 diabetes at hospitals in Kolkata, India, from 2012 to 2021. A large data set was used to train and test the classification model.

Study Population

Our ulcer wound data set consisted of 1339 images obtained from 510 outpatient visits at ILS Hospital, Kolkata, West Bengal, which functions as a referral center serving a broad catchment area across eastern India and included patients not only from Kolkata and its suburban areas (eg, Howrah, Barasat, Madhyamgram, Salt Lake, New Town, Durgapur, Burdwan) but also from neighboring states such as Bihar, Jharkhand, and Tripura. This demonstrates that the data set reflects a geographically and demographically diverse population across eastern India, rather than being restricted to a single urban catchment area. The images were captured using a Samsung Galaxy S II (GT-I9100) smartphone, released in 2011, equipped with an 8-megapixel rear camera (f/2.6, 1/3.2” sensor) capable of 1080p resolution. The device employed a Super AMOLED Plus display with autofocus and LED flash support. During acquisition, a flexigrid was placed adjacent to the ulcer to capture wound dimensions (height and breadth). For each visit, two images were captured: one before debridement and one immediately after. Images were stored in JPEG format at the device’s native resolution (3264 × 2448 pixels). Image acquisition steps involved:

All patients underwent initial assessment using a standardized foot proforma.

Ulcer images were captured before and after debridement and at each follow-up visit to monitor progression.

A flexigrid was used during imaging to ensure accurate scaling of ulcer dimensions.

Images were taken in the medical examination room or at the bedside for severe cases, following a consistent protocol.

Infection status was assessed according to the Infectious Diseases Society of America (IDSA) classification, which categorizes infections as mild, moderate, and severe. Deep-tissue cultures were taken from all patients, and a negative culture result was interpreted as confirmation of infection control. Ischemia was assessed in every patient using the Ankle-Brachial Index (ABI), where values of 0.90 to 1.40 were considered normal, 0.70 to 0.89 mild disease, 0.50 to 0.69 moderate disease, and less than 0.50 severe disease. Charcot foot was identified when the temperature difference between the affected foot and the contralateral foot was greater than 2°C, after other causes were excluded. The diagnosis was then confirmed radiographically by identifying characteristic bony changes on a foot X-ray. In this study, “healing at eight weeks” was defined as an area reduction of the ulcer region by more than 50%.
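The ABI thresholds and the eight-week healing criterion stated above can be written as a small rule set. The following is a minimal sketch rather than the authors' code: the function names are hypothetical, values are rounded to two decimals on the assumption that the thresholds are quoted to that precision, and values above 1.40 are reported as outside the stated ranges because the text does not classify them.

```python
def abi_category(abi: float) -> str:
    """Map an Ankle-Brachial Index value to the categories stated in the text."""
    abi = round(abi, 2)  # assumption: thresholds are quoted to two decimals
    if abi > 1.40:
        return "outside stated ranges"  # not classified in the text
    if abi >= 0.90:
        return "normal"
    if abi >= 0.70:
        return "mild"
    if abi >= 0.50:
        return "moderate"
    return "severe"


def healed_at_eight_weeks(initial_area: float, area_at_8_weeks: float) -> bool:
    """Healing is defined as an ulcer-area reduction of more than 50%."""
    if initial_area <= 0:
        raise ValueError("initial_area must be positive")
    reduction = (initial_area - area_at_8_weeks) / initial_area
    return reduction > 0.5
```

Both areas must be expressed in the same unit, for example as measured against the flexigrid.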

To ensure consistency in image quality and wound visibility, photographs were taken post-debridement in cases involving slough, and heavy dressings were removed before image acquisition. These measures were implemented to minimize variability and improve the reliability of the data set.

A clinical expert from ILS Hospital, Kolkata, India, annotated all ulcer images according to the Wagner grading system (grades 0-5). Given that this was a retrospective study using existing, de-identified data, formal ethical approval was not required, as per institutional guidelines. To maintain patient confidentiality, particularly with sensitive image data, we used pseudonymization. This method replaced all original patient identifiers with randomly generated, non-identifiable tokens. This approach allowed us to securely link data for analytical purposes (such as diagnosis or age) while ensuring that a patient’s true identity remained protected. The critical mapping between original identifiers and pseudonymized tokens is secured in a separate, highly restricted-access system that adheres to data protection protocols. To improve model generalization, we employed data augmentation techniques, such as cropping, rotation, and flipping, which increased the sample size to 6579. Table 1 outlines the severity grades along with the corresponding number of images for each grade.

Table 1.

Wagner’s classification of DFU images in different grades.

Ulcer severity	Description	Number of images
Grade 0	Pre-foot ulceration	15
Grade 1	Partial foot ulceration	177
Grade 2	Deep ulcer to bone, ligament, tendon, or joint	491
Grade 3	Deep foot ulcer with osteomyelitis	119
Grade 4	Ischemic foot gangrene	342
Grade 5	Entire foot gangrene	195

The inclusion criteria for participants in this study were as follows: (a) diagnosed with type 2 diabetes, (b) diagnosed with diabetic neuropathy or peripheral arterial disease, (c) having an active foot ulcer below the ankle, (d) aged between 16 and 80 years, and (e) capable of using a handheld device. The exclusion criteria included (a) individuals aged under 16 or over 80, (b) those with a total foot amputation, (c) patients with other types of peripheral neuropathy (such as those caused by alcohol abuse, familial conditions, or hypovitaminosis), and (d) poor-quality images, such as those that are blurry, too dark, or too bright. The data set comprises images with varying resolutions, spanning from 576 × 1040 to 4608 × 4608 pixels. Representative examples of these graded images are presented in Figure 1.

Figure 1.

Annotation of ulcer wounds as per Wagner’s grades.

Study Population—Train, Test, and Validation Data Sets

The patient data set was randomly divided into training and testing subsets in an 80:20 ratio.

In addition, internal validation was conducted on 600 image samples, separate from the total of 6579 samples, to ensure the model’s generalizability. The internal cohort of 600 patients had no overlap with the training and test data sets, ie, the image samples differ in terms of geography and/or time. Similar inclusion and exclusion criteria were applied to this image data set.

Materials

Data Preprocessing

Data Augmentation

To overcome the challenge of limited image samples and reduce the risk of overfitting, data augmentation techniques were employed. Deep-learning models, especially convolutional neural networks (CNNs), require large volumes of training data to perform optimally. With a smaller data set, the CNN’s parameters were less likely to converge effectively, leading to overfitting. To address this, various augmentation strategies, including cropping, random rotations at 45, 90, 135, and 180 degrees, as well as horizontal and vertical flipping, were applied. As a result, the data set was expanded to a total of 6579 samples. These enhancements improved the model’s generalization ability by introducing greater variability into the training data.
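The augmentation strategies listed above can be sketched with NumPy alone. This is an illustrative sketch, not the authors' pipeline: the helper names, the nearest-neighbour rotation, the 80% centre crop, and the choice of one variant per transform are all assumptions, since the text does not state the interpolation method, the crop size, or how many variants were generated per image.

```python
import numpy as np


def rotate_nearest(img: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate about the image centre with nearest-neighbour sampling (same shape)."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(angle_deg)
    rows, cols = np.indices((h, w))
    y, x = rows - cy, cols - cx
    # Inverse mapping: locate the source pixel for every output pixel.
    src_r = np.rint(cy + y * np.cos(t) - x * np.sin(t)).astype(int)
    src_c = np.rint(cx + y * np.sin(t) + x * np.cos(t)).astype(int)
    inside = (src_r >= 0) & (src_r < h) & (src_c >= 0) & (src_c < w)
    out = np.zeros_like(img)  # pixels without a source stay black
    out[inside] = img[src_r[inside], src_c[inside]]
    return out


def centre_crop(img: np.ndarray, fraction: float = 0.8) -> np.ndarray:
    """Keep the central part of the image (crop fraction is an assumption)."""
    h, w = img.shape[:2]
    ch, cw = int(h * fraction), int(w * fraction)
    r0, c0 = (h - ch) // 2, (w - cw) // 2
    return img[r0:r0 + ch, c0:c0 + cw]


def augment(img: np.ndarray) -> list:
    """Return rotated, flipped, and cropped variants of one image."""
    variants = [rotate_nearest(img, a) for a in (45, 90, 135, 180)]
    variants.append(np.fliplr(img))    # horizontal flip
    variants.append(np.flipud(img))    # vertical flip
    variants.append(centre_crop(img))  # crop
    return variants
```

Each augmented image inherits the Wagner grade of its source image, so no additional annotation is needed.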

Image Rescale

To improve efficiency, all images were resized to a uniform dimension of 300 × 300 pixels. This process decreased the computational load and ensured consistent input dimensions for subsequent stages, facilitating smoother processing.

Grayscale Conversion

After resizing, the images were converted to grayscale, as different tissues exhibited distinct grayscale values, allowing for targeted analysis of intensity variations that are critical for detecting ulcer regions.
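The rescaling and grayscale steps can be illustrated as follows. This is a sketch under stated assumptions: the text does not name the interpolation method or the grayscale weights, so nearest-neighbour sampling and the ITU-R BT.601 luma weights are used here, and the function names are hypothetical.

```python
import numpy as np


def resize_nearest(img: np.ndarray, size=(300, 300)) -> np.ndarray:
    """Resize to `size` (rows, cols) by nearest-neighbour sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows[:, None], cols]


def to_grayscale(rgb: np.ndarray) -> np.ndarray:
    """Convert an RGB image to 8-bit grayscale with BT.601 weights (assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb[..., :3].astype(float) @ weights).astype(np.uint8)


def preprocess(rgb: np.ndarray) -> np.ndarray:
    """Resize first, then convert to grayscale, in the order described in the text."""
    return to_grayscale(resize_nearest(rgb))
```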

Aims

We developed a multi-stage segmentation and classification model for DFUs aimed at improving early and accurate diagnosis based on Wagner’s grading system. The model included the following:

A novel segmentation approach was developed to detect and isolate the foot ulcer region by constructing a bounding box, which was subsequently used for classifying ulcer severity based on Wagner grades.

A carefully structured, multi-stage segmentation pipeline was designed, comprising (a) morphological operations for noise reduction and structural refinement, (b) an improved Active Contour Model (iACM) that enabled robust boundary fitting, even in complex wound geometries, (c) a Sobel Operator (SO) that enhanced gradient transitions and guided contour convergence, and (d) Contour Extraction (CE) for further refinement of the segmentation output.

A lightweight MobileNetV3-Small deep-learning model was employed to categorize DFUs across six Wagner grades, from early stage lesions (Grade 0) to advanced gangrene (Grade 5). This compact architecture was deliberately selected to support seamless integration into a smartphone application, with the aim of enabling accessible, app-based early screening in health care settings.

Validation was performed with an independent patient cohort with no overlap with the training data, ensuring the model’s performance reflected its robustness in real-world clinical scenarios.

Methods

Following image preprocessing, the ulcer region of interest (ROI) was detected using a combination of Canny edge detection, image binarization, and morphological operations. This multi-step approach enhanced structural features and isolated the wound area for precise segmentation. A lightweight classifier was then employed to assign Wagner grades to the segmented ulcers. As this study exclusively utilized ulcer images for segmentation and classification, no tabular clinical data or patient-level variables were incorporated into the modeling process. Consequently, missing data handling and Least Absolute Shrinkage and Selection Operator (LASSO)-based variable selection procedures were not applicable. The complete workflow, spanning preprocessing, wound segmentation, and severity classification, is depicted in Figure 2.

Figure 2.

Process overview of the proposed segmentation method and classification model of DFU.

Refinement of Ulcer Images

This section focuses on utilizing the Canny edge detector, image binarization, and morphological operations to identify the ROI or ulcerous wound regions for segmentation.

Canny Edge Detector

The Canny Edge Detector²⁰ is a multi-stage algorithm that detects edges, here the potential boundaries of ulcers, with a low error rate. It begins by applying a Gaussian filter to remove noise from the ulcer images, then computes intensity gradients, and applies non-maximum suppression followed by hysteresis thresholding to eliminate false edges of the ulcer regions. The resulting edge maps highlight the boundaries in the ulcer images, as shown in Figure 3.

Figure 3.

Ulcer edge maps after applying the Canny edge detector.

Binarization

The edge maps generated by the Canny detector were transformed into binary masks using a threshold-based binarization process. This step enhanced the contrast between the ulcer (foreground) and the surrounding tissue (background), making it easier to isolate contour boundaries specific to the affected region. By leveraging pixel intensity gradients, the binarization step ensured smooth and distinct edge delineation. Pixels with intensities above the defined threshold of 80 were treated as strong edges and assigned a value of 255 (white), while those at or below the threshold were set to 0 (black). The resulting binary image clearly outlined the ulcer margins as bright contours against a dark backdrop, streamlining the segmentation process.
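The thresholding rule described above reduces to a single comparison per pixel. A minimal sketch, with a hypothetical function name:

```python
import numpy as np


def binarize(edge_map, threshold: int = 80) -> np.ndarray:
    """Pixels strictly above the threshold become 255; all others become 0."""
    edge_map = np.asarray(edge_map)
    return np.where(edge_map > threshold, 255, 0).astype(np.uint8)
```

A pixel with intensity exactly 80 is mapped to 0, matching the "at or below the threshold" wording.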

Morphological Operations

Once the binary mask was generated, morphological operations such as erosion and dilation were applied to remove noise from the binary ulcer images. Erosion smoothed and reduced the boundaries of the ulcer regions, while dilation enhanced them.²¹ A closing operation was performed prior to opening to further refine the ulcer contours.²¹ Let $I_{b}$ denote the binary image and $S$ be the structural element. Erosion (E) and dilation (D) were defined in equations 1 and 2, respectively.

$$E(I_{b}, S) = I_{b} \ominus S \tag{1}$$

$$D(I_{b}, S) = I_{b} \oplus S \tag{2}$$

Closing Operation

The closing operation, which consisted of dilation followed by erosion, was used to fill small holes and gaps within the foreground regions of the ulcer image. This process helped produce a more cohesive representation of the target features, thereby enhancing the overall quality of the binary image. The closing operation was defined in equation 3.

$$I_{b} \bullet S = (I_{b} \oplus S) \ominus S \tag{3}$$

Opening Operation

The opening operation reduced noise while preserving the structural integrity of the ulcer regions. Erosion was first applied to eliminate small noise artifacts, although it also slightly shrank the ulcer boundaries. To compensate, dilation was subsequently used to restore the eroded ulcer regions without introducing additional noise. This combined process effectively removed thin lines while maintaining the shape and size of the significant ulcer areas in the binary image, as illustrated in Figure 4. The mathematical representation of the opening operation was provided in equation 4.

$$I_{b} \circ S = (I_{b} \ominus S) \oplus S \tag{4}$$

Figure 4.

Ulcer edge maps after applying binarization and morphological operations.
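Equations 1 to 4 can be illustrated with a NumPy-only sketch for a 3 × 3 square structuring element. This is not the authors' implementation: the function names are hypothetical, pixels outside the image are treated as neutral (background for dilation, foreground for erosion), and a single pass of closing followed by opening is shown, whereas the iteration counts used in the experiments are reported in the results.

```python
import numpy as np


def _neighbourhoods(padded: np.ndarray, shape) -> list:
    """Return the nine shifted views that make up each 3 x 3 neighbourhood."""
    h, w = shape
    return [padded[dr:dr + h, dc:dc + w] for dr in range(3) for dc in range(3)]


def dilate(mask) -> np.ndarray:
    """Equation 2: a pixel is set if any pixel in its 3 x 3 neighbourhood is set."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    return np.logical_or.reduce(_neighbourhoods(padded, mask.shape))


def erode(mask) -> np.ndarray:
    """Equation 1: a pixel is kept only if its whole 3 x 3 neighbourhood is set."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=True)
    return np.logical_and.reduce(_neighbourhoods(padded, mask.shape))


def closing(mask) -> np.ndarray:
    """Equation 3: dilation followed by erosion (fills small holes)."""
    return erode(dilate(mask))


def opening(mask) -> np.ndarray:
    """Equation 4: erosion followed by dilation (removes small specks)."""
    return dilate(erode(mask))


def refine(mask) -> np.ndarray:
    """Closing first, then opening, in the order described in the text."""
    return opening(closing(mask))
```

On a mask containing a solid block with a one-pixel hole and an isolated one-pixel speck, `refine` fills the hole and removes the speck while leaving the block outline unchanged.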

Segmentation of Ulcer Region

After generating the edge maps of the ulcers, the active contour model (commonly known as the snake model)²² was applied to segment the wound area. However, due to its reliance on the initial selection of edge points, the conventional snake model struggled to segment complex objects automatically. To overcome this limitation, the model was enhanced to improve its effectiveness.

Improved Active Contour Model

We enhanced the original snake model²³ using a controlled continuity spline framework, influenced by both image-derived and external constraint forces. Internal spline forces maintained piecewise smoothness, while image-based forces directed the snake toward key features such as edges and contours. The position of the snake was represented parametrically as $C(s) = (x(s), y(s))$, and the corresponding energy function was defined in equation 5.

$$E_{\text{snake}} = \int_{0}^{1} \left( E_{\text{internal}} + E_{\text{image}} + E_{\text{external}} \right) ds \tag{5}$$

The term $E_{\text{internal}}$ represented the internal energy resulting from bending, while $E_{\text{image}}$ accounted for the forces derived from the binary image. In addition, $E_{\text{external}}$ denoted the external constraint forces acting on the contour. The mathematical formulations of the internal and external energies were provided in equations 6 and 7, respectively.

$$E_{\text{internal}} = α(s) {|C'(s)|}^{2} + β(s) {|C''(s)|}^{2} \tag{6}$$

$$E_{\text{external}}(x, y) = -{|\nabla I(x, y)|}^{2} \tag{7}$$

In this context, $I(x, y)$ represented the intensity of the binary image at pixel coordinates $(x, y)$, ∇ denoted the gradient operator, and $C(s) = [x(s), y(s)]$ described the contour curve, where $s$ was a parameter ranging from 0 to 1. The first- and second-order derivatives of the contour with respect to $s$ were represented as $C'(s)$ and $C''(s)$, respectively, while $α(s)$ and $β(s)$ served as the first- and second-order continuity weighting functions of $C(s)$. The iACM iteratively updated the contour by minimizing an energy function that balanced the internal energy (smoothness) with the external energy derived from the binary image gradient. This approach enabled accurate fitting of the contour to the ulcer boundaries, aiding in precise region isolation. The contour was initially placed as a circle centered on the centroid of the binary mask, which was determined using image moments. These moments, denoted by $M$, were computed from the noise-removed binary mask produced by morphological operations. The centroid coordinates ($C_{x}$, $C_{y}$) were calculated using the spatial moments of the ulcer images, as defined in equations 8 and 9.

$$C_{x} = \frac{M_{01}}{M_{00}} \tag{8}$$

$$C_{y} = \frac{M_{10}}{M_{00}} \tag{9}$$

$M_{00}$ was defined as the zero-order moment, while $M_{01}$ and $M_{10}$ represented the first-order moments. The contour was initialized as a circle of radius $R$ centered on the centroid. The coordinates of the points along this initial contour, $(r(θ), c(θ))$, were computed using equations 10 and 11.

$$r(θ) = C_{x} + R \sin(θ) \tag{10}$$

$$c(θ) = C_{y} + R \cos(θ) \tag{11}$$

The radius $R$ is a predefined constant, and $θ \in (0, 2 π)$ . In equation 6, $α$ controls the contour’s elasticity and $β$ controls the rigidity or smoothness. $γ$ is the explicit time-stepping parameter that influences the model’s convergence speed. These adjustments enable a balanced trade-off between internal and external energies, allowing the model to dynamically adjust and accurately encapsulate the ulcer contours, and effectively detect the ulcer boundaries. From this segmented contour, the coordinates of the bounding box are extracted to ensure adherence to image dimensions.
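The contour initialization in equations 8 to 11 can be sketched as follows. The sketch assumes the common moment convention $M_{pq} = \sum_{x,y} x^{p} y^{q} I(x, y)$ with $x$ as the column index and $y$ as the row index, under which $C_{x} = M_{01}/M_{00}$ is the row coordinate used in equation 10. The function names and the number of contour points are hypothetical, and the energy minimization itself is not shown.

```python
import numpy as np


def centroid_from_moments(mask) -> tuple:
    """Return (C_x, C_y) as defined in equations 8 and 9."""
    mask = np.asarray(mask, dtype=float)
    rows, cols = np.indices(mask.shape)
    m00 = mask.sum()                # zero-order moment
    if m00 == 0:
        raise ValueError("mask is empty; centroid is undefined")
    m10 = (cols * mask).sum()       # first-order moment in x (columns)
    m01 = (rows * mask).sum()       # first-order moment in y (rows)
    c_x = m01 / m00                 # equation 8: row coordinate of the centroid
    c_y = m10 / m00                 # equation 9: column coordinate of the centroid
    return c_x, c_y


def initial_contour(c_x: float, c_y: float, radius: float = 150.0,
                    n_points: int = 200) -> np.ndarray:
    """Return an (n_points, 2) array of (row, col) points on the initial circle."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    r = c_x + radius * np.sin(theta)  # equation 10
    c = c_y + radius * np.cos(theta)  # equation 11
    return np.column_stack([r, c])
```

With a radius of 150 on a 300 × 300 image, some initial points can fall outside the image unless the centroid sits at the exact centre, so an implementation would likely need to clip them to the image bounds.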

Sobel Operator

The SO²⁴ was applied within the region defined by the bounding box generated by the iACM to detect and delineate the prominent edges of the ulcer area. This technique significantly enhanced the visual clarity of the ulcer boundaries, thereby facilitating the extraction of precise contours. Two 3 × 3 convolution kernels were used for this process, as illustrated in equations 12 and 13.

$$K_{x} = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} \tag{12}$$

$$K_{y} = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \tag{13}$$

With $I$ as the original image, the corresponding gradients were defined in equations 14 and 15. A gradient magnitude image was generated using equation 16 to encapsulate the cohesive edge information within the ulcer region.

$$G_{x} = K_{x} * I = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * I \tag{14}$$

$$G_{y} = K_{y} * I = \begin{bmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} * I \tag{15}$$

$$G = \sqrt{G_{x}^{2} + G_{y}^{2}} \tag{16}$$
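Equations 12 to 16 can be checked with a short NumPy sketch. It treats `*` as true convolution (the kernel is flipped before sliding) and replicates the border pixels when padding; both choices, along with the function names, are assumptions for illustration. The gradient magnitude is the same whether convolution or correlation is used, because flipping these kernels only changes the sign of $G_{x}$ and $G_{y}$.

```python
import numpy as np

K_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)   # equation 12
K_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float)   # equation 13


def convolve3x3(img, kernel: np.ndarray) -> np.ndarray:
    """2D convolution with a 3 x 3 kernel; output has the input's shape."""
    flipped = kernel[::-1, ::-1]  # flip the kernel for true convolution
    padded = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    h, w = padded.shape[0] - 2, padded.shape[1] - 2
    out = np.zeros((h, w))
    for dr in range(3):
        for dc in range(3):
            out += flipped[dr, dc] * padded[dr:dr + h, dc:dc + w]
    return out


def sobel_magnitude(img) -> np.ndarray:
    """Gradient magnitude G from equations 14 to 16."""
    g_x = convolve3x3(img, K_X)   # equation 14
    g_y = convolve3x3(img, K_Y)   # equation 15
    return np.sqrt(g_x ** 2 + g_y ** 2)  # equation 16
```

For an image with a vertical step from 0 to 1, the magnitude is 4 in the two columns adjacent to the step and 0 elsewhere.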

Binary Mask Creation

At this stage, a refined binary mask had been generated from the outputs of the iACM and the SO. This step removed thin lines and irrelevant small objects, retaining only the ulcer region. The resulting binary mask is illustrated in Figure 5.

Figure 5.

Binary mask of the wound region after applying iACM and SO.

Contour Extraction

From the refined binary mask, contours were extracted. Among the detected contours, the most significant one was selected, as it accurately represented the ulcer region. A bounding box was then generated around the isolated ulcer to ensure alignment with the image dimensions, as illustrated in Figure 6. The generated segmentation masks were systematically stored in structured folders, categorized by the severity grade of their corresponding original images. This ensured each mask was accurately linked to its clinically assigned severity grade, which then served as input for training the classification model, thereby eliminating the need for manual annotation of the segmentation masks.

Figure 6.

Bounding box of the segmented ulcer region.
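The selection of the most significant contour and its bounding box can be sketched without an imaging library. The sketch interprets "most significant" as the largest connected foreground region and uses 8-connectivity; both are assumptions, and the function name is hypothetical.

```python
from collections import deque

import numpy as np


def largest_region_bbox(mask):
    """Return (row_min, col_min, row_max, col_max) of the largest region, or None."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    seen = np.zeros_like(mask)
    best = []
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        # Breadth-first flood fill over one 8-connected region.
        seen[r0, c0] = True
        queue, region = deque([(int(r0), int(c0))]), []
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w and mask[rr, cc] and not seen[rr, cc]:
                        seen[rr, cc] = True
                        queue.append((rr, cc))
        if len(region) > len(best):
            best = region
    if not best:
        return None
    rows, cols = zip(*best)
    return min(rows), min(cols), max(rows), max(cols)
```

Because the coordinates come from pixels inside the mask, the returned box always lies within the image dimensions.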

Classification of Ulcer Severity

Figure 7 illustrates the workflow of our proposed iACMSOCE method with the MobileNetV3-Small classifier. The MobileNetV3-Small classifier, which is pre-trained on ImageNet, is used to classify the severity of ulcer wounds. This variant was chosen over the MobileNetV3-Large due to its compact architecture, lower parameter count, and reduced number of floating-point operations. Its efficient design allows for faster inference and lower memory usage, making it well-suited for mobile applications. We have developed a mobile app that integrates this lightweight classifier, making it ideal for use in a smartphone environment. The mobile app is designed to be simple and user-friendly for both patients and health care providers, as shown in Figure 8. New users complete a quick registration, providing basic personal and medical details (name, email, username, password, gender, and age). The app calculates body mass index (BMI) and offers a notes section for additional health information. The interface is centered around a dashboard with a bottom navigation bar for key functions: Dashboard, Scan, History, and Profile. The dashboard provides a summary of the user’s activity and record completeness. The Scan feature allows users to either capture a new photo of an affected foot or upload an existing one. This image is then combined with the user’s health data to run the built-in prediction model for DFU detection and grading. The results, including ulcer grade and risk level, are presented clearly on an output screen. All results are saved in the History section, enabling users to track their condition over time and share updates with their doctors. The workflow is a straightforward progression: Registration or Login → Profile Setup → Dashboard → Image Capture/Upload → AI Prediction & Result → History Tracking. Users have the flexibility to skip medical data entry and add it later.
This AI-powered mobile application was designed to support primary care physicians, community health workers, remote clinicians, and patients by providing real-time assessment of DFU severity using the Wagner grading system. Its primary utility lies in facilitating early detection of DFUs to enable timely intervention, guiding precise treatment planning based on severity classification, and enhancing remote monitoring for continuity of care, particularly in underserved areas. Furthermore, the application ensures secure, centralized access to patient images and records through cloud storage, thereby bridging the gap between advanced technical capabilities and practical clinical utility, especially in resource-limited or remote settings. The app interface displays ulcer severity outcomes based on inputs provided by the user, which include clinical features and images of foot ulcers. In the context of mobile deployment, images can be captured or uploaded from the mobile device and are transmitted to the cloud backend using secure, encrypted communication protocols (eg, HTTPS with TLS 1.2). To minimize risk, images are stored on the mobile device for the absolute minimum necessary time, ideally not at all, before deletion upon successful upload. For secure, centralized storage and data persistence, all patient data, including personal details and medical imagery, were maintained on a confidential cloud server. The app was deployed on a cloud hosting platform, allowing for real-time, remote DFU severity prediction through a lightweight classifier. This architecture enabled seamless and secure communication between the app and the server, making it accessible to clinicians, patients, and remote health care providers from any location.

Figure 7.

Schematic representation of the workflow of the proposed iACMSOCE-based segmentation method and MobileNetV3-Small classifier.

Figure 8.

Design pipeline of the AI-powered mobile app for DFU severity prediction.

Results and Analysis

All experiments were performed using Python on an Intel i5 processor equipped with 8 GB of RAM. A data set comprising 510 patients was collected, initially containing 1339 image samples. These were subsequently augmented to a total of 6579 samples. The data set was divided into train and test sets in an 80:20 ratio, yielding 5264 samples for training and 1315 for testing. The proposed method was validated using an independent cohort of 600 patients, with predictor variables consistent with those in the augmented data set (Table 2). To enhance the prominence of ulcer edges, one iteration of closing and two iterations of opening were applied using a 3 × 3 kernel. All ulcer images were resized to 300 × 300 pixels to improve generalization. The radius $R$ for the iACM was set to 150, and the hyperparameters $α, β$ , and γ were assigned values of 0.06, 0.1, and 0.001, respectively, to ensure optimal performance.

Table 2.

Distribution of ulcer images across Wagner’s grades in the augmented data set.

Ulcer grades	Description	Number of images
Grade 0	Pre-foot ulceration	820
Grade 1	Partial foot ulceration	828
Grade 2	Deep ulcer to bone, ligament, tendon, or joint	800
Grade 3	Deep foot ulcer with osteomyelitis	840
Grade 4	Ischemic foot gangrene	1799
Grade 5	Entire foot gangrene	1492
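The counts in Table 2 and the 80:20 split reported above can be cross-checked with a short sketch. The per-grade counts are taken from Table 2; the random seed, the image-level shuffling, and rounding the test size down are assumptions made here for illustration.

```python
import random

# Per-grade image counts copied from Table 2.
GRADE_COUNTS = {0: 820, 1: 828, 2: 800, 3: 840, 4: 1799, 5: 1492}


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed (assumption) for repeatability
    n_test = int(n_samples * test_fraction)  # round down (assumption)
    return indices[n_test:], indices[:n_test]


total = sum(GRADE_COUNTS.values())
train_idx, test_idx = split_indices(total)

# Share of each grade in the augmented data set, in percent.
shares = {grade: 100 * count / total for grade, count in GRADE_COUNTS.items()}

print(total, len(train_idx), len(test_idx))
```

With the test size rounded down, the sketch yields 5264 training and 1315 test samples out of 6579, matching the figures stated in the text. The `shares` dictionary makes the larger share of Grades 4 and 5 visible.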

Edge detection was performed using the SO with a $3 \times 3$ kernel. MobileNetV3-Small, with 17 layers, was used for classification, and its final classifier layer was modified to match the number of severity grades in the data set. The Adam optimizer was used with a learning rate of 0.001 during model training to achieve faster convergence than stochastic gradient descent (SGD) and AdaGrad. The segmentation accuracy of our iACMSOCE method was evaluated using the Dice similarity coefficient (DSC), a measure of the spatial overlap between the predicted and ground-truth segmentation masks. The formula for the DSC is given in equation 17.
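As an illustration of the edge detection step, the following sketch applies the standard 3 × 3 Sobel kernels to a grayscale image stored as a list of rows. It shows the generic operator only; leaving the one-pixel border at zero and the toy image are assumptions, and the sketch does not reproduce how the SO is coupled to the iACM in the proposed method.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude for interior pixels; the 1-pixel border is left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = img[y + j - 1][x + i - 1]
                    gx += SOBEL_X[j][i] * pixel
                    gy += SOBEL_Y[j][i] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy 5x5 image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 1, 1, 1] for _ in range(5)]
edges = sobel_magnitude(step)
```

The response is high only on the two columns adjacent to the step and zero inside the flat region, which is why the operator is useful for locating ulcer boundaries.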

$$\mathrm{DSC} = \frac{2 \times \text{area of overlap}}{\text{total area}} \tag{17}$$
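Equation 17 can be sketched for binary masks as follows. The sketch assumes that the total area is the sum of the pixel counts of the predicted and ground-truth masks (the usual reading of the Dice coefficient) and that two empty masks score 1; the 3 × 3 masks are toy data.

```python
def dice_coefficient(pred, truth):
    """DSC = 2 * overlap / (area of pred + area of truth) for binary masks."""
    overlap = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            pred_area += bool(p)
            truth_area += bool(t)
            overlap += bool(p and t)
    total = pred_area + truth_area
    if total == 0:
        return 1.0  # convention (assumption): two empty masks agree perfectly
    return 2 * overlap / total


# Toy masks: 4 predicted pixels, 4 ground-truth pixels, 3 shared.
pred = [[1, 1, 0],
        [1, 1, 0],
        [0, 0, 0]]
truth = [[1, 1, 0],
         [1, 0, 0],
         [1, 0, 0]]
print(dice_coefficient(pred, truth))  # 2 * 3 / (4 + 4) = 0.75
```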

In equation 17, the area of overlap is the intersection of the predicted and ground-truth masks, and the total area is the sum of the areas of the two masks; a DSC of 1 represents perfect alignment and 0 denotes no overlap. A total of 300 images were analyzed, 50 for each of Wagner’s grades (0-5). The computed DSCs were as follows: Grade 0 (0.97), Grade 1 (0.99), Grade 2 (0.99), Grade 3 (0.99), Grade 4 (0.99), and Grade 5 (0.98). The DSC thus ranged from 0.97 to 0.99, indicating accurate segmentation of ulcer regions across all severity levels. This level of precision contributed to a reduction in misclassification errors during severity grading.

We compared our iACMSOCE segmentation method with several existing segmentation approaches.¹⁹,²⁵⁻²⁷ The bounding box in Figure 9 highlights that the iACMSOCE method delineated the ulcer region more accurately than these approaches. Different classifiers, including MobileNetV3, artificial neural networks, and MobileNetV2,²⁸ were used to analyze the existing approaches¹⁹,²⁵⁻²⁷ and evaluate their classification accuracy. A brief interpretation of each metric and its clinical relevance is as follows:

Accuracy reflects the overall proportion of DFUs correctly classified into their respective Wagner grades.

Precision indicates the proportion of DFUs predicted as a specific grade (eg, Grade 3) that are truly of that grade. High precision minimizes over-classification and avoids unnecessary escalation of care.

Recall measures the proportion of actual DFUs of a given grade (eg, Grade 4) that the model correctly identifies. High recall is essential for detecting severe cases requiring urgent intervention.

Specificity is the model’s ability to correctly identify DFUs that are not a specific grade, preventing it from incorrectly flagging low-grade DFUs as more severe.

F1-score balances precision and recall, offering a robust measure of performance, especially in data sets with class imbalance.

False-Negative Rate (FNR) quantifies the proportion of severe DFUs that the model fails to identify.

The performance of each approach was assessed based on accuracy, precision, sensitivity, specificity, F1 score, false-positive rate (FPR), and FNR as presented in Table 3.
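The metrics listed above can be derived from a multi-class confusion matrix by treating each grade one-vs-rest. The sketch below is illustrative only: the macro (unweighted) averaging across grades is an assumption, since the averaging scheme is not stated, and the 3 × 3 matrix contains toy numbers rather than study data.

```python
def one_vs_rest_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix cm[true][predicted]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    per_class = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # grade k predicted as another grade
        fp = sum(cm[r][k] for r in range(n)) - tp   # other grades predicted as k
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class.append((precision, recall, specificity, f1))
    # Macro average (assumption): every grade counts equally.
    precision, sensitivity, specificity, f1 = (
        sum(m[i] for m in per_class) / n for i in range(4))
    return {
        "accuracy": sum(cm[k][k] for k in range(n)) / total,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "fpr": 1 - specificity,
        "fnr": 1 - sensitivity,
    }


# Toy confusion matrix for three classes (illustrative numbers only).
toy_cm = [[8, 2, 0],
          [1, 9, 0],
          [0, 0, 10]]
metrics = one_vs_rest_metrics(toy_cm)
```

Support-weighted averaging is another common choice; under that scheme the averaged sensitivity always equals the overall accuracy.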

Figure 9.

Segmented ulcer (ROI) using the iACMSOCE method and existing methods.

Table 3.

Comparative evaluation of iACMSOCE method coupled with MobileNetV3-Small versus existing segmentation methods and classifiers.

Evaluation metrics	ACM¹⁹	K-means²⁵	FCM²⁵	ResNet18²⁶	GVF²⁷	iACMSOCE with MobileNetV3-Small
Accuracy	80.20	89.03	91.12	88.57	94.97	95.58
Precision	79.92	89.10	89.58	88.61	95.13	95.54
Sensitivity	80.20	89.03	89.94	88.57	94.97	95.58
Specificity	95.44	95.54	98.92	93.12	98.12	99.16
F1-Score	80.00	88.99	88.72	88.44	94.83	95.53
FPR	4.56	10.88	1.08	9.12	9.88	0.84
FNR	23.76	7.97	6.35	19.43	5.03	4.83

The iACMSOCE segmentation with the MobileNetV3-Small classifier achieved an accuracy of 95.58%, a precision of 95.54%, a sensitivity of 95.58%, a specificity of 99.16%, and an F1 score of 95.53% after 100 testing epochs. It also achieved the lowest FPR (0.84%) and FNR (4.83%) of all compared methods. In contrast, the FPR and FNR were 4.56% and 23.76% for ACM,¹⁹ 10.88% and 7.97% for K-means,²⁵ 1.08% and 6.35% for FCM,²⁵ 9.12% and 19.43% for ResNet18,²⁶ and 9.88% and 5.03% for GVF.²⁷ Thus, among the compared methods, iACMSOCE with MobileNetV3-Small produced the fewest false positives and false negatives, improving overall reliability in clinical assessment. To further evaluate the generalizability of the proposed model, internal validation was conducted using 600 ulcer images collected from the same hospital but from a different patient cohort than those included in the training and testing sets. On this validation data set, the method achieved a specificity of 94.83%, indicating few false positives, together with a sensitivity of 89.04% and a precision of 86.80%.

Discussion

The Wagner grading system was selected for this technical feasibility study because it relates directly to ulcer depth, which aligns with our image-based visual assessment. Its simplicity supports consistent application and inter-rater reliability during initial model validation. Because it is commonly documented in retrospective and resource-limited settings, it was a pragmatic choice for early development. The Wagner system also remains a foundational tool in clinical guidelines, complementing broader assessment frameworks, and its continued relevance is supported by recent clinical studies²⁹⁻³¹ that use or compare this grading system. Although its use has declined in favor of more comprehensive frameworks such as the wound/ischemia/foot infection (WIfI), University of Texas (UT), and International Working Group on the Diabetic Foot (IWGDF) systems, some centers still employ it. Our collaborating clinical sites have since transitioned to these updated systems. This study therefore serves as a technical proof of concept, demonstrating the model’s ability to identify key visual features. In future studies, we plan to evaluate and integrate these comprehensive frameworks into our algorithms, incorporating ischemic and systemic parameters for classification with systems such as WIfI and UT, to enhance clinical relevance and alignment with current practice.

The iACMSOCE with MobileNetV3-Small framework is a multi-stage pipeline integrating an iACM with Sobel edge detection, CE, and a lightweight MobileNetV3-Small classifier. Our method used a diverse data set of 1339 original ulcer images from 510 patients, which was augmented to 6579 images through techniques including rotation, scaling, and brightness variation.
This augmentation strategy was important for improving the classifier’s generalization across ulcer grades and for mitigating overfitting, a common issue in previous studies with smaller data sets. In contrast to earlier efforts such as Niri et al⁷ with 219 images, Heras-Tang et al⁸ with 37 images, and Rajathi et al¹⁹ with 1500 images, iACMSOCE with MobileNetV3-Small benefited from broader visual diversity. Similar augmentation strategies were employed by Wang et al⁹ with 4050 images, Yu et al¹³ with 62 124 images, Alzubaidi et al¹⁵ with 13 572 images, and Toofanee et al¹⁷ with 15 683 images, all of whom reported strong classification results using large data sets and hybrid networks. The iACMSOCE with MobileNetV3-Small framework achieved 95.58% accuracy, 95.58% sensitivity, and an F1 score of 95.53% while remaining lightweight and suitable for mobile deployment. These results suggest that targeted data augmentation improved both training diversity and the efficacy of compact classification models.

In terms of the reported metrics, iACMSOCE outperformed a range of existing DFU segmentation and classification methods. For segmentation, it achieved DSCs between 0.97 and 0.99 across all Wagner grades, exceeding the performance of Niri et al’s⁷ superpixel-based CNN (Dice: 0.75 on 219 images) and Wang et al’s¹⁴ MobileNetV2 framework (Dice: 0.90 on 5000 images). Hybrid techniques, such as those by Heras-Tang et al⁸ and Chen et al,¹⁰ delivered moderate results (eg, Jaccard Index: 0.81, Accuracy: 94%, Sensitivity: 86%) but lacked robustness across complex ulcer geometries. In classification, iACMSOCE achieved 95.58% accuracy, 95.58% sensitivity, 99.16% specificity, and an F1 score of 95.53% while maintaining a low FPR of 0.84% and an FNR of 4.83%.
A 4.83% FNR indicates that approximately 5% of severe DFU cases may be misclassified, which can have critical clinical consequences, including delayed treatment, accelerated disease progression, increased amputation risk, patient harm, and a higher health care burden. While this study is a technical feasibility assessment, we prioritize reducing this FNR through refinement, broader validation, and clinical integration for patient safety. Comparative performance included ACM (80.20% accuracy, 80.20% sensitivity, 95.44% specificity, F1: 80.00%),¹⁹ K-means (89.03% accuracy, F1: 88.99%),²⁵ FCM (91.12% accuracy, F1: 88.72%),²⁵ ResNet18 (88.57% accuracy, F1: 88.44%),²⁶ and GVF (94.97% accuracy, F1: 94.83%).²⁷ Although DUTC-Net¹⁹ achieved strong scores (97.5% accuracy, 97.1% sensitivity, 98% specificity), its reliance on computationally heavy preprocessing, such as reflection and hair removal, makes it less feasible for mobile applications. By contrast, the lightweight architecture of iACMSOCE with MobileNetV3-Small, paired with its strong performance, makes it well suited to resource-constrained mobile health environments, offering both precision and scalability for real-time DFU screening and severity classification.

The proposed model serves primarily as a triage and remote monitoring tool to assist in the early detection and management of DFUs. It enables non-specialists and primary care providers to identify ulcer severity at an early stage, facilitating prompt referral, earlier diagnosis of infection, and timely management. A pilot test involving 100 patients at ILS Hospital confirmed the app’s ease of use, smooth workflow, and desired results. While larger usability tests are planned for the future, the current version shows promise as a tool for DFU risk prediction.

Despite the strong performance of the proposed framework, a few limitations must be acknowledged. The retrospective design of this study carries a risk of selection bias. Furthermore, its single-center origin limits the generalizability of our findings to broader populations. Since patient demographics, clinical practices, and wound care protocols may vary across regions and institutions, multi-center data collection in future studies will help validate the model’s robustness and ensure wider clinical applicability. External validation and model calibration are currently being performed as part of an ongoing follow-up study and will be presented in a future publication. A significant methodological constraint is the reliance on Wagner’s Classification, which does not incorporate ischemic parameters, thereby omitting a critical dimension when assessing the true severity of DFUs. This study did not include vascular status or offloading practices in the data set, which may influence ulcer healing outcomes and affect the overall performance and generalizability of the model.

Conclusion

The proposed method integrates an iACM, the SO, and CE to accurately isolate ulcer regions in foot images. The segmented regions are classified using MobileNetV3-Small, a lightweight deep-learning model optimized for resource-constrained devices. Using a data set spanning six severity grades, the algorithm automates the segmentation process without relying on any built-in tools. The comparative study shows that this method outperforms the compared segmentation and classification techniques on every evaluation metric reported in Table 3. By combining an optimized segmentation algorithm with a lightweight classification architecture, this approach facilitates efficient foot ulcer grading and supports timely clinical decision-making, with the aim of improving patient outcomes. In addition, a mobile app was developed for real-time severity grading of ulcers according to the Wagner classification system. In the future, the model will be validated using an external data set that does not overlap with the training and testing data sets to ensure its robustness and generalizability.

Footnotes

Acknowledgements

The authors would like to thank the staff at the ILS Hospital, Kolkata, India, for their assistance with data collection. The authors confirm that generative AI was not used to create the ulcer images, tables, mathematical operations or formulas applied during ulcer segmentation, accuracy results, Dice scores, or references. Generative AI was used exclusively to improve the English language structure of the text.

Abbreviations

ACM, active contour model; AI, artificial intelligence; CE, contour extraction; CNN, convolutional neural network; DFU, diabetic foot ulcer; DSC, Dice similarity coefficient; FCM, Fuzzy C-means; FNR, false-negative rate; FPR, false-positive rate; GVF, gradient vector flow; iACM, improved active contour model; iACMSOCE, improved active contour model + Sobel operator + contour extraction; PSNR, peak signal-to-noise ratio; ROI, region of interest; SO, Sobel operator; U-Net, U-shaped convolutional network; UT, University of Texas classification; WIfI, wound/ischemia/foot infection classification system.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Saswati Mukherjee

References

1. Saeedi, Petersohn, Salpea, et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: results from the International Diabetes Federation Diabetes Atlas. Diabetes Res Clin Pract. 2019;157:107843.

2. Akkus, Sert. Diabetic foot ulcers: a devastating complication of diabetes mellitus continues non-stop in spite of new medical treatment modalities. World J Diabetes. 2022;13(12):1106.

3. Goyal, Oakley, Bansal, Dancey, Yap MH. Automatic lesion boundary segmentation in dermoscopic images with ensemble deep learning methods. arXiv. doi:10.48550/arXiv.1902.00809.

4. Wagner FW. The diabetic foot. Orthopaedics. 1987;10(1):163-172.

5. Syauta, Mulawardi, Prihantono Hendarto, et al. Risk factors affecting the degree of diabetic foot ulcers according to Wagner classification in diabetic foot patients. Med Clin Pract. 2021;4:100231.

6. Monteiro-Soares, Hamilton, Russell, et al. Classification of foot ulcers in people with diabetes: a systematic review. Diabetes Metab Res Rev. 2024;40(3):e3645.

7. Niri, Douzi, Lucas, Treuillet. A superpixel-wise fully convolutional neural network approach for diabetic foot ulcer tissue classification. In: Del Bimbo, ed. Pattern Recognition. ICPR International Workshops and Challenges. Cham: IEEE; 2021:308-320.

8. Heras-Tang, Valdés-Santiago, León-Mecías ÁM, Baguer Díaz-Romañach, Mesejo-Chiong JA. Diabetic foot ulcer segmentation using logistic regression, DBSCAN clustering and mathematical morphology operators. Electron Lett Comput Vis Image Anal. 2022;21(2):022-039.

9. Wang, Dang, Liu, Wang. Image segmentation using active contours with image structure adaptive gradient vector flow external force. Front Appl Math Stat. 2023;9:1271296.

10. Chen, Wang, Weng. An active contour model based on Jeffreys divergence and clustering technology for image segmentation. J Vis Commun Image Represent. 2024;99:104069.

11. Peng, Wang, Tong, Zou, Liu, Zhang. Multi-threshold image segmentation of 2D OTSU inland ships based on improved genetic algorithm. PLoS ONE. 2023;18(8):e0290750.

12. Canales, García-Lamont, Yee-Rendon, Castilla JSR, Mazahua LR. Optimal segmentation of image datasets by genetic algorithms using color spaces. Expert Syst Appl. 2024;238:121950.

13. Wang, Tang, Feng. EU-net: automatic U-Net neural architecture search with differential evolutionary algorithm for medical image segmentation. Comput Biol Med. 2023;167:107579.

14. Wang, Anisuzzaman, Williamson, et al. Fully automatic wound segmentation with deep convolutional neural networks. Sci Rep. 2020;10:21897.

15. Alzubaidi, Abbood, Fadhel, Al-Shamma, Zhang. Comparison of hybrid convolutional neural networks models for diabetic foot ulcer classification. J Eng Sci Technol. 2021;16(3):2001-2017.

16. Al-Garaawi, Ebsim, Alharan AFH, Yap MH. Diabetic foot ulcer classification using mapped binary patterns and convolutional neural networks. Comput Biol Med. 2022;140:105055.

17. Toofanee MSA, Dowlut, Hamroun, et al. DFU-Siam: a novel diabetic foot ulcer classification with deep learning. IEEE Access. 2023;11:98315-98332.

18. Motta, Marques, Guimarães, Ferreira, Rosa. The evaluation of the healing process of diabetic foot wounds using image segmentation and neural networks classification. Int J Biomed Eng Technol. 2022;38(2):179-192.

19. Rajathi, Chinnasamy, Selvakumari. DUTC net: a novel deep ulcer tissue classification network with stage prediction and treatment plan recommendation. Biomed Signal Process Control. 2024;90:105855.

20. Canny. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell. 1986;8(6):679-698.

21. Bhutada, Yashwanth, Dheeraj, Shekar. Opening and closing in morphological image processing. World J Adv Res Rev. 2022;14(3):687-695.

22. Kass, Witkin, Terzopoulos. Snakes: active contour models. Int J Comput Vis. 1988;1(4):321-331.

23. Behara, Bhero, Agee JT. An improved skin lesion classification using a hybrid approach with active contour snake model and lightweight attention-guided capsule networks. Diagnostics. 2024;14(6):636.

24. Sobel. An isotropic 3×3 image gradient operator. In: Freeman, ed. Machine Vision for Three-Dimensional Scenes. San Diego, CA: Academic Press; 1990:376-379.

25. Wiharto, Suryani. The comparison of clustering algorithms K-means and fuzzy C-means for segmentation retinal blood vessels. Acta Inform Med. 2020;28(1):42-47.

26. Gómez-Flores, de Albuquerque Pereira WC. A comparative study of pre-trained convolutional neural networks for semantic segmentation of breast tumours in ultrasound. Comput Biol Med. 2020;126:104036.

27. Maryani, Anwar, Abimanyu. Accuracy of prostate volume measurement on 2D transabdominal USG modality using the gradient vector flow (GVF) segmentation application measurement technique. Int J Sci Horiz. 2023;2(10):725-733.

28. Huynh, Tran TN. Classification of stages of diabetic retinopathy using MobileNetV2 model. Kalpa Publ Eng. 2022;4:147-157.

29. Han, Zhang, et al, eds. Deep learning methods for real-time detection and analysis of Wagner ulcer classification system. In: 2022 International Conference on Computer Applications Technology (CCAT). Guangzhou: IEEE; 2022:11-21.

30. Zhou, Tao, Hou, et al. Construction and validation of a deep learning-based diagnostic model for segmentation and classification of diabetic foot. Front Endocrinol. 2025;16:1543192. doi:10.3389/fendo.2025.1543192.

31. Mutailipu, Zhang, Zhu. Correlation analysis of nutritional status of diabetic foot patients with different Wagner grades [published online ahead of print August 22, 2023]. Int J Diabetes Dev Ctries. doi:10.1007/s13410-023-01224-1.