Automated Tuberculosis classification using Egret Swarm Optimization with deep learning based fusion model on chest X-ray images

Abstract

Tuberculosis (TB) remains a serious illness that affects people worldwide and can be fatal when left untreated. Chest X-Ray (CXR) imaging is the preferred choice for recognizing pulmonary diseases in hospitals because it is cost-efficient and readily available in many nations. However, manual screening of CXR images places a heavy load on radiologists and results in a high inter-observer discrepancy rate. Computer-Aided Detection (CAD) is now a powerful imaging tool for detecting and screening dangerous ailments, and Deep Learning (DL) based CAD schemes have recently shown promising outcomes in the recognition of TB. This study introduces an Egret Swarm Optimization Algorithm with Deep Feature Fusion based Tuberculosis Classification (ESOA-DFFTC) technique on CXR images. The presented ESOA-DFFTC technique uses feature fusion and hyperparameter tuning for the classification of TB. The model first applies Gaussian Filtering (GF) for image denoising. Next, it performs feature fusion using three DL models, namely ResNeXt-50, MobileNetv2, and Xception. To enhance the performance of the DL models, an ESOA-based hyperparameter optimizer is employed. For TB classification, the ESOA-DFFTC technique uses a Weight-Dropped Long Short-Term Memory (WDLSTM) model whose parameters are tuned by the Arithmetic Optimization Algorithm (AOA). The experimental results of the ESOA-DFFTC system were examined on a benchmark medical imaging dataset. A wide comparative study showed that the ESOA-DFFTC system outperforms other current algorithms.

Keywords

Computer-aided diagnosis, machine learning, tuberculosis, chest X-ray images, feature fusion, metaheuristics

1 Introduction

Tuberculosis (TB) is a transmissible disease caused mainly by the bacterium Mycobacterium tuberculosis, which spreads from one individual to another through the air via sneezing, coughing, and spitting [1]. TB that affects the lungs is called pulmonary TB; TB that affects other parts of the body, such as the bones and spine, is known as extrapulmonary TB. Precise diagnosis of TB requires a reliable and accurate interpretation of the test results [2]. The WHO therefore recommends CXR for detecting pulmonary abnormalities because of its low cost, availability, and high sensitivity [3]. However, diagnosis from CXR images requires extensive practice, is labour-intensive, and is vulnerable to casual human error, and the recommendation was made despite limited medical resources and a lack of skilled radiologists [4]. With advances in computer vision, Artificial Intelligence (AI)-assisted diagnosis, and digital imaging systems, these problems are being tackled. These technologies have been assessed for the diagnosis of biomedical images [5]. In the healthcare sector, AI is increasingly used to support physicians and radiologists in improving their diagnostic decisions for various diseases [6]. AI methods have been implemented with several techniques to raise diagnostic precision. They aim to learn from the input data in order to predict unseen future cases.

CAD tools are a promising support for radiologists and an asset in medical imaging for TB detection [7]. Several studies have sought to build a robust diagnostic mechanism [8]. For instance, a CAD method was constructed for diagnosing TB cavities by locating regions of interest in CXR images. It addressed a drawback of earlier CAD mechanisms, which failed to detect TB cavities because of superimposed anatomical structures in the pulmonary region. Likewise, a CAD method was formulated that identifies TB directly [9]. These developments have paved the way for a clearer image of the pulmonary surface in which opaque or mass lesions can be detected, resulting in a more targeted TB diagnosis. Recent developments in AI have led to methods that can identify more TB features [10]. Various DL mechanisms have recently been developed for analyzing digital CXRs for TB-related abnormalities; they can address current shortcomings by reducing inter-reader variability, improving reproducibility, and providing radiological services where medical specialists are unavailable.

This study introduces an Egret Swarm Optimization Algorithm with Deep Feature Fusion based Tuberculosis Classification (ESOA-DFFTC) technique on CXR images. The presented ESOA-DFFTC method uses feature fusion and hyperparameter tuning for the classification of TB. It first applies Gaussian Filtering (GF) for image denoising and then performs feature fusion using three DL models, namely ResNeXt-50, MobileNetv2, and Xception, together with the ESOA-based hyperparameter optimizer. For TB classification, the ESOA-DFFTC method uses a Weight-Dropped Long Short-Term Memory (WDLSTM) model whose parameters are tuned by the Arithmetic Optimization Algorithm (AOA). The experimental results of the ESOA-DFFTC method are examined on a benchmark medical imaging dataset.

2 Related works

Xu and Yuan [11] presented an automatic, low-cost technique for detecting pulmonary TB on CXR images to assist primary-care radiologists. They introduced a CNN-based pulmonary TB classification technique that uses DL to categorize CXR scans. The study adds a coordinate attention module to a CNN (VGG16) so that the algorithm can capture direction-aware and position-aware information along with cross-channel information when identifying and classifying pulmonary TB images. The study in [12] aims to design a robust TB detection technique that relies on stochastic learning with an Artificial Neural Network (ANN) with randomized variation, using CXR images. This method can incorporate random functions into the network, either by assigning stochastic transfer functions or by assigning stochastic weights.

Rahman et al. [13] focus on detecting and diagnosing TB in CXRs by integrating a pretrained Deep Convolutional Neural Network (DCNN) with a Machine Learning (ML) technique. They combined the pretrained DenseNet201 model with the XGBoost technique to create a hybrid method that classifies patients as TB or normal. The method extracts features with the pretrained DenseNet201 network and categorizes them using the XGBoost classifier. In [14], the effects of Transfer Learning (TL) methods on CXR databases for TB recognition are explored and compared with the presented Deep Neural Network (DNN) model. Unlike the complicated TL methods, the presented model has a standard structure with a limited number of parameters, and it uses dropout in some layers and batch normalization to deal with overfitting.

Raju et al. [15] introduced a DL-oriented method for automated TB screening from CXR images. India ranks first in TB cases. For active TB diagnosis, a chest radiograph is used in symptomatic patients. This screening is preferably done at primary healthcare centres where a doctor is available, and occasionally through mobile CXR units. The main difficulties with this kind of screening are timely reporting for treatment initiation and the further follow-up of patients. The authors built several CNNs, using existing DL techniques, to construct the method for automated TB diagnosis. Singh and Hamde [16] presented an automated TB detection technique that uses standard digital chest radiographs. The technique contains three main phases. First, the authors extracted the pulmonary regions from the CXR using morphological techniques and log-Gabor filtering. Next, a set of shape and texture features is calculated from the segmented images. Finally, the computed feature vector is passed to a Support Vector Machine (SVM), which classifies the input CXR as TB-infected or healthy.

Ignatius et al. [17] presented a deep CNN method applied to histogram-matched CXR images that does not require segmentation of the object of interest; pairing histogram matching with the CXRs enhances the detection performance and precision of CNN methods for TB diagnosis. The study includes two separate experiments that used CXR images with and without histogram matching to classify TB and non-TB cases with DCNNs. It could precisely identify TB from CXR images using deep CNN models, pre-processing, and data augmentation. Iqbal et al. [18] presented an efficient and direct DL network named TBXNet, which precisely classifies a large number of TB CXR images. Dual convolutional blocks were integrated with a pretrained layer in the fusion layer of the network. The pretrained layer was used to transfer pretrained knowledge into the fusion layer.

3 The proposed model

In the current research, a novel ESOA-DFFTC methodology is presented for automated TB classification on CXR images. The methodology exploits feature fusion and hyperparameter tuning for TB classification. It comprises a series of subprocesses, namely GF-based preprocessing, fusion-based feature extraction, ESOA-based hyperparameter tuning, WDLSTM classification, and AOA-based parameter optimization. Figure 1 portrays the workflow of the ESOA-DFFTC model.

Fig. 1

Overall process of ESOA-DFFTC system.

3.1 Image pre-processing

Image denoising is a crucial pre-processing phase that removes noise from the image [19]. Conventional denoising approaches smooth an image by allocating equal weights to every pixel, whereas better denoising techniques allocate unequal weights. In particular, GF is a linear smoothing filter whose weight decreases as the distance from the central pixel increases, according to the Gaussian function. The input pixel in GF is weighted based on the following expression: $g (x, y) = \frac{1}{σ \sqrt{2 π}} e^{- \frac{d^{2}}{2 σ^{2}}}$ (1)

where $d = \sqrt{(x - x_{c})^{2} + (y - y_{c})^{2}}$ is the distance of the pixel [x, y] from the central pixel [x_c, y_c].
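The weighting in Equation (1) can be illustrated with a minimal NumPy sketch. The kernel size, the value of σ, the normalisation of the weights to sum to 1, and the reflective border padding are assumptions made for illustration; the paper does not state them.

```python
import numpy as np

def gaussian_kernel(size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Weights from Eq. (1), normalised so that they sum to 1 (assumption)."""
    c = size // 2                                   # index of the central pixel
    y, x = np.mgrid[0:size, 0:size]
    d2 = (x - c) ** 2 + (y - c) ** 2                # squared distance d^2
    g = np.exp(-d2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
    return g / g.sum()

def gaussian_filter(image: np.ndarray, size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Denoise a 2-D grayscale image with the kernel above."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="reflect")
    out = np.zeros(image.shape, dtype=float)
    h, w = image.shape
    # Accumulate the weighted, shifted copies of the image.
    for dy in range(size):
        for dx in range(size):
            out += k[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Because the kernel is symmetric, the weighted sum above is the same whether it is read as a correlation or a convolution.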

3.2 Fusion-based feature extraction

The feature fusion process uses three DL models, namely ResNeXt-50, MobileNetv2, and Xception. Data fusion has been used in several applications, such as ML and computer vision (CV). Feature fusion is an essential operation that combines several feature vectors [20]. The presented technique relies on entropy-based feature fusion, with the individual feature vectors defined by: $f_{{ResNet}_{1 \times n}} = \{ {ResNet}_{1 \times 1}, {ResNet}_{1 \times 2}, {ResNet}_{1 \times 3}, \dots, {ResNet}_{1 \times n} \}$ (2)

$f_{{EfficientNet}_{1 \times m}} = \{ {EfficientNet}_{1 \times 1}, {EfficientNet}_{1 \times 2}, {EfficientNet}_{1 \times 3}, \dots, {EfficientNet}_{1 \times m} \}$ (3) $f_{{DLBP}_{1 \times p}} = \{ {DLBP}_{1 \times 1}, {DLBP}_{1 \times 2}, {DLBP}_{1 \times 3}, \dots, {DLBP}_{1 \times p} \}$ (4)

The extracted features are then fused into a single vector: $Fused {(features vector)}_{1 \times q} = \sum_{i = 1}^{3} \{ f_{{ResNet}_{1 \times n}}, f_{{EfficientNet}_{1 \times m}}, f_{{DLBP}_{1 \times p}} \}$ (5)

Here, the fused vector has size 1 × 1186. Entropy is applied to the feature vector to choose the better features based on their scores. The feature selection (FS) scheme is expressed mathematically in Equations (2)–(5). Entropy is used to select the 1186 highest-scoring features out of 7835 features. $B_{He} = - N{He}_{b} \sum_{i = 1}^{n} p (f_{i})$ (6) $F_{select} = B_{He} (\max (f_{i}, 1186))$ (7)

In Equations (6) and (7), p denotes the feature probability and He denotes the entropy. Finally, the chosen features are given to the classifier in order to differentiate TB images from healthy images.
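The fusion and selection steps can be sketched as follows, reading the fusion of Equation (5) as a column-wise concatenation of the per-model feature matrices. The histogram-based Shannon entropy used as the score is an assumption, since the paper does not specify how p(f_i) is estimated, and the function names are hypothetical.

```python
import numpy as np

def fuse_features(*feature_sets: np.ndarray) -> np.ndarray:
    """Concatenate per-model feature matrices (samples x features) column-wise."""
    return np.concatenate(feature_sets, axis=1)

def entropy_scores(features: np.ndarray, bins: int = 16) -> np.ndarray:
    """Shannon entropy of each feature column, estimated from a histogram."""
    scores = np.empty(features.shape[1])
    for j in range(features.shape[1]):
        counts, _ = np.histogram(features[:, j], bins=bins)
        p = counts[counts > 0] / counts.sum()       # empirical probabilities
        scores[j] = -(p * np.log2(p)).sum()
    return scores

def select_top_k(features: np.ndarray, k: int) -> np.ndarray:
    """Keep the k columns with the highest entropy score, in original order."""
    idx = np.argsort(entropy_scores(features))[::-1][:k]
    return features[:, np.sort(idx)]
```

In the setting described above, `k` would be 1186 and the fused matrix would have 7835 columns; constant (zero-entropy) features are the first to be discarded.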

3.2.1 ResNeXt-50 model

Residual Networks (ResNet) route the input around a block of layers through a shortcut and thereby preserve the input information during the entire process [21]. As the depth of the network grows, the deep layers suffer from vanishing gradients, which hampers learning; shortcut connections alleviate this problem and allow additional layers to improve the functioning of the network. $H (i) = f (W * i) + b$ (8)

In Equation (8), f denotes the activation function; when the network takes the shortcut path, the output becomes: $H (i) = f (i) + i$ (9)
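The effect of the shortcut path can be seen in a small NumPy sketch. The ReLU activation and the fully connected layer are assumptions chosen for brevity; the actual ResNeXt-50 blocks use grouped convolutions.

```python
import numpy as np

def relu(z: np.ndarray) -> np.ndarray:
    return np.maximum(z, 0.0)

def plain_layer(i: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Eq. (8) as written: H(i) = f(W * i) + b."""
    return relu(W @ i) + b

def residual_block(i: np.ndarray, W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Shortcut path of Eq. (9): the layer output is added to the untouched input."""
    return plain_layer(i, W, b) + i
```

With all-zero weights the plain layer outputs zeros, whereas the residual block still returns its input unchanged, which is why stacked residual blocks do not lose the input signal.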

The major differences among the ResNet variants are the number of deep layers and the convolution and identity blocks they use. The ResNet block significantly improves performance. The 18-layer ResNet is trained with 1.8 billion flops and 11.17 million parameters. Its filters have 64-512 output channels in convolution layers 2 through 5. Similarly, ResNet50 and ResNet152 require 3.8 and 4 billion flops and employ 23.52 and 25.5 million trainable parameters, respectively. The key difference between ResNet152 and ResNeXt50 is that ResNeXt50 uses 32 parallel paths (its cardinality), whereas ResNet152 does not. ResNeXt50 uses 128-2048 output channels for its filters, whereas ResNet152 uses 64-5102 output channels.

3.2.2 MobileNetv2 model

MobileNet is a DNN best known for its use in lightweight applications [22]. It uses depthwise separable convolutions, which help to reduce the number of parameters. MobileNetV2 follows nearly the same structure as V1; however, the building block of V2 and its final 1×1 layer work somewhat differently. In MobileNetV1, the pointwise convolutional layer either keeps the number of channels unchanged or doubles it, whereas in V2 it acts as a projection layer that decreases the number of channels. Because the projection layer reduces the amount of data flowing through the network, it is also called a bottleneck layer. A further concept added in V2 is the residual connection of ResNet.
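The parameter saving can be made concrete with a short calculation. This is a sketch that ignores bias terms and assumes a depthwise separable layer made of one k × k filter per input channel followed by a 1 × 1 pointwise convolution; the layer sizes in the example are illustrative and not taken from the paper.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights of a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in        # one k x k filter per input channel
    pointwise = c_in * c_out        # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise

# Illustrative layer: 3 x 3 kernel, 32 input channels, 64 output channels.
print(standard_conv_params(3, 32, 64))        # 18432
print(depthwise_separable_params(3, 32, 64))  # 2336
```

For this layer the separable form needs roughly one eighth of the weights of the standard convolution.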

3.2.3 Xception model

Inspired by the Inception network architecture, Google researchers devised this DNN model around a modified depthwise separable convolution. The modified operation is a revised version of the original depthwise separable convolution. The original version first performs a channel-wise spatial convolution and then a 1×1 convolution. The modified version first performs the 1×1 convolution and then the channel-wise spatial convolution. In addition, the non-linearity that follows the first operation in the Inception module is omitted in the Xception module.

3.2.4 Hyperparameter tuning

The ESOA is used to adjust the hyperparameters of the three DL models. ESOA is based on the Snowy Egret's sit-and-wait strategy and the Great Egret's aggressive strategy [23]; it integrates the benefits of both strategies and builds a corresponding mathematical model to quantify the behaviour. The ESOA is a parallel process with three building blocks: the sit-and-wait strategy, the aggressive strategy, and the discriminant condition. Each Egret Squad (ES) contains three Egrets: Egret A uses a guiding forward mechanism, whereas Egret B and Egret C adopt random walk and encircling mechanisms, respectively, as follows:

Sit-and-Wait Strategy

Assume that the location of the i-th ES is $x_{i} \in ℝ^{n}$, where n indicates the problem dimension, and A (·) is the Snowy Egret's estimation method of the prey at the present position. ${\hat{y}}_{i}$ denotes the estimate of the prey at the present position, ${\hat{y}}_{i} = A (x_{i}),$ (10)

The estimation method is then parameterized by, ${\hat{y}}_{i} = w_{i} \cdot x_{i},$ (11)

where $w_{i} \in ℝ^{n}$ denotes the weight of the estimation method. The error e_i is defined by, $e_{i} = \frac{{∥ {\hat{y}}_{i} - y_{i} ∥}^{2}}{2} .$ (12)

Meanwhile, ${\hat{g}}_{i} \in ℝ^{n}$, the practical gradient of w_i, is obtained by taking the partial derivative of the error in Equation (12) with respect to w_i, and its direction is ${\hat{d}}_{i}$: $\begin{array}{l} {\hat{g}}_{i} = \frac{\partial e_{i}}{\partial w_{i}} = \frac{\partial {‖ {\hat{y}}_{i} - y_{i} ‖}^{2} / 2}{\partial w_{i}} = ({\hat{y}}_{i} - y_{i}) \cdot x_{i}, \\ {\hat{d}}_{i} = {\hat{g}}_{i} / | {\hat{g}}_{i} | . \end{array}$ (13)

The squad's movement is also guided by the best Egrets, whose experience in estimating prey behaviour is integrated into the search. $d_{h, i} \in ℝ^{n}$ and $d_{g, i} \in ℝ^{n}$ denote the directional corrections toward the best position of the squad and the best position of all squads, respectively. $d_{h, i} = \frac{x_{ibest} - x_{i}}{| x_{ibest} - x_{i} |} \cdot \frac{f_{ibest} - f_{i}}{| x_{ibest} - x_{i} |} + d_{ibest}$ (14) $d_{g, i} = \frac{x_{gbest} - x_{i}}{| x_{gbest} - x_{i} |} \cdot \frac{f_{gbest} - f_{i}}{| x_{gbest} - x_{i} |} + d_{gbest}$ (15)

The integrated gradient $g_{i} \in ℝ^{n}$ is formulated as follows, with r_h ∈ [0, 0.5) and r_g ∈ [0, 0.5): $g_{i} = (1 - r_{h} - r_{g}) \cdot {\hat{d}}_{i} + r_{h} \cdot d_{h, i} + r_{g} \cdot d_{g, i}$ (16)

An adaptive weight update is used, with β₁ = 0.9 and β₂ = 0.99: $\begin{matrix} m_{i} = β_{1} \cdot m_{i} + (1 - β_{1}) \cdot g_{i}, \\ v_{i} = β_{2} \cdot v_{i} + (1 - β_{2}) \cdot g_{i}^{2}, \\ w_{i} = w_{i} - m_{i} / \sqrt{v_{i}} \end{matrix}$ (17)

Based on Egret A's judgement of the present situation, the next sampling position x_a,i is defined by, $x_{a, i} = x_{i} + {step}_{a} \cdot \exp (- \frac{t}{0.1 \cdot t_{\max}}) \cdot hop \cdot g_{i},$ (18)

$y_{a, i} = f (x_{a, i})$ (19)

where t and t_max indicate the current iteration and the maximum number of iterations, and hop denotes the gap between the lower and upper boundaries of the solution space. step_a ∈ (0, 1] is the step size factor of Egret A, and y_a,i is the fitness of x_a,i.
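The sit-and-wait update of Egret A, Equations (11) to (18), can be sketched in NumPy as follows. The sketch assumes that the target y_i is the true fitness f(x_i), that β₂ drives the second-moment update, and that a small constant `EPS` guards the divisions; the function name, the tuple layout of the best-position arguments, and the default `step_a` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # assumed guard against division by zero; not part of the paper

def sit_and_wait(x, w, m, v, f_x, best_squad, best_global, t, t_max, hop, rng,
                 step_a=0.1, beta1=0.9, beta2=0.99):
    """One Egret A update; best_* are (position, fitness, direction) tuples."""
    y_hat = w @ x                                    # Eq. (11)
    g_hat = (y_hat - f_x) * x                        # Eq. (13), with y_i = f(x_i)
    d_hat = g_hat / (np.linalg.norm(g_hat) + EPS)

    def correction(best):                            # Eqs. (14) and (15)
        x_b, f_b, d_b = best
        diff = x_b - x
        n = np.linalg.norm(diff) + EPS
        return diff / n * (f_b - f_x) / n + d_b

    r_h, r_g = rng.uniform(0.0, 0.5, size=2)
    g = ((1 - r_h - r_g) * d_hat
         + r_h * correction(best_squad)
         + r_g * correction(best_global))            # Eq. (16)

    m = beta1 * m + (1 - beta1) * g                  # Eq. (17)
    v = beta2 * v + (1 - beta2) * g ** 2
    w = w - m / (np.sqrt(v) + EPS)

    x_a = x + step_a * np.exp(-t / (0.1 * t_max)) * hop * g   # Eq. (18)
    return x_a, w, m, v
```

The exponential factor in Equation (18) shrinks the step quickly, so Egret A makes large moves only during the first iterations.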

Aggressive Strategy

Egret B tends to hunt for prey randomly, and its behaviour is described as follows: $x_{b, i} = x_{i} + {step}_{b} \cdot \tan (r_{b, i}) \cdot \frac{hop}{1 + t}$ (20) $y_{b, i} = f (x_{b, i})$ (21)

In Equation (20), r_b,i denotes a random value within (- π/2, π/2), step_b ∈ (0, 1] denotes the step size factor of Egret B, x_b,i represents the expected next location of Egret B, and y_b,i is its fitness. Egret C pursues prey aggressively, so the encircling mechanism is used to update its location:

$\begin{array}{l} D_{h} = x_{ibest} - x_{i}, \\ D_{g} = x_{gbest} - x_{i}, \end{array}$ (22) $\begin{matrix} x_{c, i} = (1 - r_{h} - r_{g}) \cdot x_{i} + r_{h} \cdot D_{h} + r_{g} \cdot D_{g}, \\ y_{c, i} = f (x_{c, i}) \end{matrix}$ (23)

where D_h is the gap matrix between the present location and the best location of the ES, and D_g is the gap matrix between the present location and the best location of all ESs. x_c,i indicates the expected position of Egret C. r_h and r_g are random numbers within [0, 0.5).
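A minimal NumPy sketch of the two aggressive-strategy updates follows. The function names and the default `step_b` are hypothetical, and the random numbers are drawn with NumPy's half-open uniform sampler, which is an implementation choice rather than something the paper prescribes.

```python
import numpy as np

def random_walk(x, t, hop, rng, step_b=0.1):
    """Egret B, Eq. (20): heavy-tailed random step that shrinks as t grows."""
    r_b = rng.uniform(-np.pi / 2, np.pi / 2, size=x.shape)
    return x + step_b * np.tan(r_b) * hop / (1 + t)

def encircle(x, x_ibest, x_gbest, rng):
    """Egret C, Eqs. (22) and (23)."""
    d_h = x_ibest - x                 # gap to the best location of the squad
    d_g = x_gbest - x                 # gap to the best location of all squads
    r_h, r_g = rng.uniform(0.0, 0.5, size=2)
    return (1 - r_h - r_g) * x + r_h * d_h + r_g * d_g
```

The tangent of a uniform angle produces occasional very long jumps, which helps Egret B escape the neighbourhood of the current position.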

Discriminant Condition

Once every member of the ES has decided on its plan, the squad selects the best option and acts together. x_s,i denotes the solution matrix of the i-th ES: $x_{s, i} = [x_{a, i} \; x_{b, i} \; x_{c, i}]$ (24) $y_{s, i} = [y_{a, i} \; y_{b, i} \; y_{c, i}]$ (25) $c_{i} = argmin (y_{s, i})$ (26)

$x_{i} = \begin{cases} x_{s, i} |_{c_{i}} & \text{if } y_{s, i} |_{c_{i}} < y_{i} \text{ or } r < 0.3, \\ x_{i} & \text{else} \end{cases}$ (27)

If the minimum value of y_s,i is better than the present fitness y_i, the Egret squad accepts the choice. Otherwise, the choice is still accepted if the random number r ∈ (0, 1) is less than 0.3, which means that a worse plan is accepted with a probability of 30%.
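The discriminant condition of Equations (24) to (27) amounts to a greedy choice with a 30% chance of accepting a worse plan. A short sketch follows, in which the function name and argument layout are hypothetical.

```python
import numpy as np

def discriminant(x, y, candidates, fitness, rng, accept_prob=0.3):
    """Pick the best of the three proposals, then decide whether to move."""
    x_s = np.stack(candidates)                  # Eq. (24), one row per Egret
    y_s = np.array([fitness(c) for c in x_s])   # Eq. (25)
    c = int(np.argmin(y_s))                     # Eq. (26)
    # Eq. (27): accept if better, or with probability accept_prob otherwise.
    if y_s[c] < y or rng.random() < accept_prob:
        return x_s[c], y_s[c]
    return x, y
```

Accepting an occasional worse plan keeps the squad from stalling in a local minimum of the fitness function.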

The ESOA method derives a Fitness Function (FF) to improve the efficiency of the classifier. The FF returns a positive value that indicates the quality of a candidate solution. Here, the classifier error rate, which is to be minimized, is taken as the FF, as given in Equation (28). $\begin{matrix} fitness (x_{i}) = ClassifierErrorRate (x_{i}) \\ = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100 \end{matrix}$ (28)
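Equation (28) translates directly into code. In the sketch below the labels are made up for illustration and the function name is hypothetical.

```python
import numpy as np

def classifier_error_rate(y_true, y_pred) -> float:
    """Eq. (28): percentage of misclassified samples."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred) * 100.0)

# One of four illustrative samples is misclassified.
print(classifier_error_rate([0, 1, 1, 0], [0, 1, 0, 0]))  # 25.0
```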

3.3 TB classification using WDLSTM model

The WDLSTM is used in this study for automated TB classification. LSTM is a kind of ANN with feedback connections, specifically a kind of Recurrent Neural Network (RNN) [24]. LSTM networks are frequently employed for processing speech, video, or images; for instance, they have been applied in human activity recognition, speech recognition, handwriting recognition, and language processing. An LSTM unit comprises a memory cell and three gates, or regulators, that control the data flow inside the unit: the input, output, and forget gates. A related variant, the Gated Recurrent Unit (GRU), uses a different set of gates. The LSTM process is expressed as: $i_{t} = σ (W^{i} x_{t} + U^{i} h_{t - 1})$ (29) $f_{t} = σ (W^{f} x_{t} + U^{f} h_{t - 1})$ (30) $o_{t} = σ (W^{o} x_{t} + U^{o} h_{t - 1})$ (31) $c_{t}^{'} = \tanh (W^{c} x_{t} + U^{c} h_{t - 1})$ (32) $c_{t} = i_{t} \times c_{t}^{'} + f_{t} \times c_{t - 1}$ (33) $h_{t} = o_{t} \times \tanh (c_{t})$ (34)

In these expressions, σ represents the sigmoid activation function, W and U denote the weight matrices, x_t is the input vector at time step t, c_t indicates the memory cell state, h_t denotes the current hidden state, and × signifies component-wise multiplication. The memory cell retains the dependencies between input features. The sigmoid and hyperbolic tangent functions are the activation functions of the LSTM. WDLSTM is an LSTM network regularized through the drop-connect method, a generalization of dropout in which individual connections, rather than output units, are dropped with probability 1 - p. In other words, drop-connect introduces dynamic sparsity on the weights, whereas dropout introduces sparsity on the activations, i.e. the output vectors of the network units. In WDLSTM, the hidden-to-hidden weight matrices (Uⁱ, U^f, U^o, and U^c) of the LSTM are randomly dropped during training to avoid over-fitting. The output of an LSTM gate is formulated by: $y_{t} = σ (W x_{t} + (M \times U) h_{t - 1})$ (35)

In Equation (35), M is a binary mask matrix that encodes the connection information. The elements of M change across training iterations, so that different connectivity patterns are used.
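The drop-connect idea of Equation (35) can be sketched on top of the LSTM recurrence of Equations (29) to (34). This is a minimal NumPy illustration without bias terms, as in the equations; the dictionary layout of the weights, the function names, and the keep probability are assumptions. The masks are sampled once per forward pass and reused at every time step, which is how weight-dropped LSTMs are usually applied.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def drop_connect(U, p_keep, rng):
    """Zero individual weights of U with probability 1 - p_keep (mask M)."""
    mask = rng.random(U.shape) < p_keep
    return mask * U

def lstm_step(x_t, h_prev, c_prev, W, U):
    """Eqs. (29)-(34); W and U are dicts keyed by 'i', 'f', 'o', 'c'."""
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev)
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev)
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev)
    c_hat = np.tanh(W["c"] @ x_t + U["c"] @ h_prev)
    c_t = i_t * c_hat + f_t * c_prev
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def wdlstm_forward(xs, W, U, p_keep=0.5, rng=None, training=True):
    """Run a sequence through the LSTM with weight-dropped recurrent matrices."""
    if training:
        # One mask per matrix and forward pass, shared across time steps.
        U = {k: drop_connect(U[k], p_keep, rng) for k in U}
    hidden = U["i"].shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, W, U)
    return h
```

At inference time (`training=False`) the full recurrent matrices are used, so the prediction is deterministic.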

3.4 Parameter optimization

Finally, the AOA is used for the optimal parameter adjustment of the WDLSTM approach. The AOA is a population-based metaheuristic that optimizes candidate solutions using the arithmetic operators of mathematics and solves problems without using derivatives [25]. Like other population-based techniques, its optimization process consists of an inspection (exploration) stage and a manipulation (exploitation) stage. The AOA is summarized as follows. Initially, a set of candidate solutions is generated randomly as $X = [x_{1, 1}, x_{2, 1}, \dots, x_{N, n - 1}, x_{N, n}]$. Next, the MOA function in Equation (36) is used to accelerate the search for the optimum solution: $\begin{matrix} MOA (C_{iter}) = {MOA}_{\min} + C_{iter} \times (\frac{{MOA}_{\max} - {MOA}_{\min}}{M_{iter}}), \\ C_{iter} \in [1, M_{iter}] \end{matrix}$ (36)

Next, in the inspection stage, the AOA explores a broad region of the search space to avoid local optima, using the Math Optimizer Probability (MOP) function and two arithmetic operators, the Multiplication Operator (×) (MO) and the Division Operator (÷) (DO), as follows: $x_{i, j} (C_{iter} + 1) = \begin{cases} best (x_{j}) / (MOP + ɛ) \times (({UB}_{j} - {LB}_{j}) \times μ + {LB}_{j}), & r_{2} < 0.5 \\ best (x_{j}) \times MOP \times (({UB}_{j} - {LB}_{j}) \times μ + {LB}_{j}), & \text{otherwise} \end{cases}$ (37)

Here, μ is a control parameter that adjusts the search process, and MOP is the optimization probability function: $MOP (C_{iter}) = 1 - \frac{C_{iter}^{1 / α}}{M_{iter}^{1 / α}}, \quad C_{iter} \in [1, M_{iter}]$ (38)

α is a sensitivity parameter that determines the accuracy of the manipulation stage.

Note that if r₂ < 0.5, the DO carries out the inspection stage and the MO is neglected until the DO completes its present task; otherwise, the MO carries out the inspection stage.

Lastly, during the manipulation stage, the AOA converges accurately by refining the solutions attained in the inspection stage, again using the MOP function together with two arithmetic operators, the Addition Operator (+) (AO) and the Subtraction Operator (-) (SO):

$x_{i, j} (C_{iter} + 1) = \begin{cases} best (x_{j}) - (MOP + ɛ) \times (({UB}_{j} - {LB}_{j}) \times μ + {LB}_{j}), & r_{3} < 0.5 \\ best (x_{j}) + MOP \times (({UB}_{j} - {LB}_{j}) \times μ + {LB}_{j}), & \text{otherwise} \end{cases}$ (39)

Note that if r₃ < 0.5, the SO (-) carries out the manipulation stage and the AO (+) is neglected until the SO completes its present task; otherwise, the AO (+) carries out the manipulation stage.
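The update rules of Equations (36) to (39) can be combined into one position update, sketched below in NumPy. Several points are assumptions rather than statements of the paper: the stage is chosen by comparing a random number r₁ with MOA (the usual AOA convention), the default values of `moa_min`, `moa_max`, `alpha`, and `mu` are illustrative, and the result is clipped to the search bounds.

```python
import numpy as np

def moa(c_iter, m_iter, moa_min=0.2, moa_max=1.0):
    """Eq. (36)."""
    return moa_min + c_iter * (moa_max - moa_min) / m_iter

def mop(c_iter, m_iter, alpha=5.0):
    """Eq. (38)."""
    return 1.0 - c_iter ** (1.0 / alpha) / m_iter ** (1.0 / alpha)

def aoa_update(best, lb, ub, c_iter, m_iter, rng, mu=0.5, eps=1e-12):
    """New position of one candidate, computed dimension by dimension."""
    best = np.asarray(best, dtype=float)
    scale = (ub - lb) * mu + lb
    p = mop(c_iter, m_iter)
    new = np.empty(best.shape, dtype=float)
    for j in range(best.size):
        r1, r2, r3 = rng.random(3)
        if r1 > moa(c_iter, m_iter):          # inspection stage, Eq. (37)
            if r2 < 0.5:
                new[j] = best[j] / (p + eps) * scale[j]    # DO
            else:
                new[j] = best[j] * p * scale[j]            # MO
        else:                                 # manipulation stage, Eq. (39)
            if r3 < 0.5:
                new[j] = best[j] - (p + eps) * scale[j]    # SO
            else:
                new[j] = best[j] + p * scale[j]            # AO
    return np.clip(new, lb, ub)               # assumed bound handling
```

Because MOA grows linearly with the iteration counter, the inspection stage dominates early iterations and the manipulation stage dominates late ones.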

The choice of FF is a crucial factor in the AOA scheme. Solution encoding is used to assess the quality of each candidate solution. In this stage, the FF is built on the measure P defined in Equation (41), which is maximized: $Fitness = \max (P)$ (40) $P = \frac{TP}{TP + FP}$ (41)

In the above expressions, TP and FP represent the numbers of true positives and false positives, respectively.

4 Results and discussion

In this section, the TB classification results of the ESOA-DFFTC model are investigated using the TB CXR dataset from Kaggle [26, 27]. The repository comprises 4200 instances, with 3500 normal instances and 700 TB instances, as shown in Table 1. Figure 2 shows sample normal and tuberculosis images.

Fig. 2

Sample Images a) Normal b) Tuberculosis.

Table 1

Dataset specifics

Class name	Number of instances
Normal	3500
TB	700
Total Instances	4200

The confusion matrices of the ESOA-DFFTC methodology for TB classification are illustrated in Fig. 3. The results show that the ESOA-DFFTC methodology identifies TB effectively at every epoch count. For example, with 200 epochs, the ESOA-DFFTC technique correctly recognizes 3465 instances of the normal class and 642 instances of the TB class. With 1000 epochs, it correctly recognizes 3494 normal instances and 694 TB instances. With 2000 epochs, it correctly recognizes 3467 normal instances and 664 TB instances.

Fig. 3

Confusion matrices of ESOA-DFFTC approach (a– j) Epochs 200– 2000.

In Table 2 and Fig. 4, the TB classification performance of the ESOA-DFFTC approach is tested under several epoch counts. The results show that the ESOA-DFFTC model recognizes both the TB and normal classes effectively.

Table 2

TB classification output of ESOA-DFFTC method under several epochs

Class	Accu_y	Sens_y	Spec_y	F_score	MCC
Epoch – 200
Normal	99.00	99.00	91.71	98.68	91.94
TB	91.71	91.71	99.00	93.25	91.94
Average	95.36	95.36	95.36	95.96	91.94
Epoch – 400
Normal	99.03	99.03	93.57	98.87	93.19
TB	93.57	93.57	99.03	94.31	93.19
Average	96.30	96.30	96.30	96.59	93.19
Epoch – 600
Normal	98.97	98.97	94.00	98.89	93.29
TB	94.00	94.00	98.97	94.40	93.29
Average	96.49	96.49	96.49	96.65	93.29
Epoch – 800
Normal	98.97	98.97	94.57	98.94	93.65
TB	94.57	94.57	98.97	94.71	93.65
Average	96.77	96.77	96.77	96.82	93.65
Epoch – 1000
Normal	99.83	99.83	99.14	99.83	98.97
TB	99.14	99.14	99.83	99.14	98.97
Average	99.49	99.49	99.49	99.49	98.97
Epoch – 1200
Normal	98.60	98.60	97.00	99.00	94.12
TB	97.00	97.00	98.60	95.10	94.12
Average	97.80	97.80	97.80	97.05	94.12
Epoch – 1400
Normal	99.80	99.80	99.43	99.84	99.06
TB	99.43	99.43	99.80	99.22	99.06
Average	99.61	99.61	99.61	99.53	99.06
Epoch – 1600
Normal	99.89	99.89	99.71	99.91	99.49
TB	99.71	99.71	99.89	99.57	99.49
Average	99.80	99.80	99.80	99.74	99.49
Epoch – 1800
Normal	99.14	99.14	96.43	99.21	95.30
TB	96.43	96.43	99.14	96.09	95.30
Average	97.79	97.79	97.79	97.65	95.30
Epoch – 2000
Normal	99.06	99.06	94.86	99.01	94.08
TB	94.86	94.86	99.06	95.06	94.08
Average	96.96	96.96	96.96	97.04	94.08

Fig. 4

Accu_y output of ESOA-DFFTC technique under several epochs.

With 200 epochs, the ESOA-DFFTC method obtains an average accu_bal of 95.36%, sens_y of 95.36%, spec_y of 95.36%, F_score of 95.96%, and MCC of 91.94%. Likewise, with 600 epochs, the ESOA-DFFTC method attains average accu_bal of 96.49%, sens_y of 96.49%, spec_y of 96.49%, F_score of 96.65%, and MCC of 93.29%. Similarly, with 2000 epochs, the ESOA-DFFTC method attains average accu_bal of 96.96%, sens_y of 96.96%, spec_y of 96.96%, F_score of 97.04%, and MCC of 94.08%.
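The values reported for 200 epochs can be reproduced from the confusion-matrix counts given above, as the following sketch shows. It treats TB as the positive class and assumes that every one of the 4200 instances appears in the confusion matrix, so that 642 of 700 TB images and 3465 of 3500 normal images are recognized correctly; the function name is hypothetical.

```python
from math import sqrt

def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, F-score, MCC and balanced accuracy in percent."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    prec = tp / (tp + fp)
    f_score = 2 * prec * sens / (prec + sens)
    mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sens": 100 * sens,
        "spec": 100 * spec,
        "f_score": 100 * f_score,
        "mcc": 100 * mcc,
        "bal_acc": 100 * (sens + spec) / 2,
    }

# 200 epochs, TB as the positive class.
m = binary_metrics(tp=642, fn=58, fp=35, tn=3465)
print({k: round(val, 2) for k, val in m.items()})
# {'sens': 91.71, 'spec': 99.0, 'f_score': 93.25, 'mcc': 91.94, 'bal_acc': 95.36}
```

The rounded results agree with the TB row and the average accu_bal for 200 epochs in Table 2.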

The training loss (TLOS) and validation loss (VLOS) of the ESOA-DFFTC methodology are examined in Fig. 5. The results demonstrate that the ESOA-DFFTC methodology achieves improved performance with minimal TLOS and VLOS values. In particular, it attains reduced VLOS values.

Fig. 5

TLOS and VLOS output of ESOA-DFFTC technique.

A precision-recall (Prec_n - Reca_l) study of the ESOA-DFFTC technique on the testing dataset is shown in Fig. 6. The results show that the ESOA-DFFTC technique achieves improved Prec_n - Reca_l values for all classes.

Fig. 6

Prec_n - Reca_l output of ESOA-DFFTC technique.

A brief ROC study of the ESOA-DFFTC technique on the testing dataset is depicted in Fig. 7. The results show that the ESOA-DFFTC technique is capable of distinguishing the different classes.

Fig. 7

ROC curve output of ESOA-DFFTC methodology.

Table 3 compares the accu_y of the ESOA-DFFTC model with that of recent methods [28]. The experimental values indicate that the ResNet-18 technique attains the lowest accu_y of 96.53%. The ResNet-50 and ResNet-101 methods achieve slightly higher accu_y values of 97.63% and 97.98%, respectively. Meanwhile, the Inception-v3, VGG-19, and DenseNet-201 methods achieve closer accu_y values of 98.91%, 98.60%, and 98.32%, respectively. The ESOA-DFFTC technique attains the maximum accu_y of 99.80%.

Table 3

Accuracy analysis of ESOA-DFFTC methodology with recent systems

Methods	Accuracy (%)
ResNet-18 Model	96.53
ResNet-50 Model	97.63
ResNet-101 Model	97.98
Inception-V3 Model	98.91
VGG-19 Model	98.60
DenseNet-201 Model	98.32
ESOA-DFFTC	99.80

Lastly, a brief Computation Time (CT) examination of the ESOA-DFFTC methodology is reported in Table 4. The table values demonstrate that the ESOA-DFFTC methodology reaches effective outcomes with a minimal CT of 8.17 s, whereas all the other models require higher CT values, ranging from 15.90 s to 25.10 s. These experimental results demonstrate that the ESOA-DFFTC method achieves the highest classification performance on TB diagnosis.

Table 4

CT analysis of ESOA-DFFTC approach with current techniques

Models	CT (sec)
ResNet18	23.20
ResNet50	25.10
ResNet101	15.90
InceptionV3	25.00
VGG19	23.40
DenseNet201	20.30
ESOA-DFFTC	8.17

5 Conclusion

In the current research, a novel ESOA-DFFTC method is presented for automated TB classification on CXR images. The method exploits feature fusion and hyperparameter tuning for TB classification. First, the ESOA-DFFTC technique applies the GF approach for image denoising. Next, it carries out feature fusion using three DL models, namely ResNeXt-50, MobileNetv2, and Xception. To enhance the performance of the DL models, an ESOA-based hyperparameter optimizer is employed. For TB classification, the ESOA-DFFTC method uses a Weight-Dropped Long Short-Term Memory (WDLSTM) model whose parameters are tuned by the Arithmetic Optimization Algorithm (AOA). The ESOA-DFFTC system was tested on a benchmark medical imaging database, and a wide comparative study showed that it outperforms other current systems.

Declarations

Ethical approval

Not Applicable.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

Manivannan – Data Collection, Conceptualization

Manivannan – Data Curation, Investigation

Sathiamoorthy – Validation, Editing

Sathiamoorthy – Review and Editing

Funding

Not Applicable.

Availability of data and materials

Data are available from the authors upon request.

References

1. Santosh K.C., Allu, Rajaraman and Antani, Advances in Deep Learning for Tuberculosis Screening Using Chest X-Rays: The Last 5 Years Review, Journal of Medical Systems 46(11) (2022), 1–19.

2. Rajaraman, Folio L.R., Dimperio, Alderson P.O. and Antani S.K., Improved semantic segmentation of tuberculosis-consistent findings in chest x-rays using augmented training of modality-specific u-net models with weak localizations, Diagnostics 11(4) (2021), 616.

3. Devasia, Goswami, Lakshminarayanan, Rajaram, Adithan, Bharanidharan, Deep Learning Classification of Active Tuberculosis Using Chest X-Rays: Efficacy of Transfer Learning and Generalization Performance of Cross-Population Datasets, (2022).

4. Sharma, Gupta, Kaur, A Deep Learning Approach for Tuberculosis Diagnosis from Chest X-Rays: A Survey, (2021).

5. Rajakumar M.P., Sonia, Uma Maheswari, Karuppiah S.P., Tuberculosis detection in chest X-ray using Mayfly-algorithm optimized dual-deep-learning features, Journal of X-Ray Science and Technology (Preprint) (2021), 1–14.

6. Kotei, Thirunavukarasu, Ensemble Technique Coupled with Deep Transfer Learning Framework for Automatic Detection of Tuberculosis from Chest X-ray Radiographs, Healthcare 10(11) (2022), 2335.

7. Dasanayaka and Dissanayake M.B., Deep learning methods for screening pulmonary tuberculosis using chest x-rays, Visualization 9(1) (2021), 39–49.

8. Zaidi S.Z.Y., Akram M.U., Jameel and Alghamdi N.S., A deep learning approach for the classification of TB from NIH CXR dataset, IET Image Processing 16(3) (2022), 787–796.

9. Simi Margarat, Hemalatha, Mishra, Shaheen, Maheswari, Tamijeselvan, Pavan Kumar, Banupriya and Ferede A.W., Early Diagnosis of Tuberculosis Using Deep Learning Approach for IOT Based Healthcare Applications, Computational Intelligence and Neuroscience 2022 (2022).

10. Liu, Qin, Liu, Yang, Improving Tuberculosis Recognition on Bone-Suppressed Chest X-Rays Guided by Task-Specific Features, In International Workshop on PRedictive Intelligence In MEdicine (pp. 59–69). Springer, Cham, 2021, October.

11. and Yuan, Convolution Neural Network With Coordinate Attention for the Automatic Detection of Pulmonary Tuberculosis Images on Chest X-Rays, IEEE Access 10 (2022), 86710–86717.

12. Urooj, Suchitra, Krishnasamy, Sharma and Pathak, Stochastic Learning-Based Artificial Neural Network Model for an Automatic Tuberculosis Detection System Using Chest X-Ray Images, IEEE Access 10 (2022), 103632–103643.

13. Rahman, Cao, Li, Sun, Hao, A hybrid architecture of DenseNet201 and XGBoost to detect tuberculosis from chest x-ray, In International Symposium on Artificial Intelligence and Robotics 2021 (Vol. 11884, pp. 583–594). SPIE. 2021, October.

14. Zaidi S.Z.Y., Jameel, Akram M.U., Impact of Transfer Learning on Chest X-Ray (CXR) Images for Tuberculosis Classification, In 2021 International Conference on Robotics and Automation in Industry (ICRAI) (pp. 1–8). IEEE. 2021, October.

15. Raju, Aswath, Kadam, Pagidimarri, Automatic detection of tuberculosis using deep learning methods, In Advances in Analytics and Applications (pp. 119–129). Springer, Singapore, 2019.

16. Singh, Hamde, Tuberculosis detection using shape and texture features of chest X-rays, In Innovations in Electronics and Communication Engineering (pp. 43–50). Springer, Singapore, 2019.

17. Ignatius J.L.P., Selvakumar, Paul K.G.J.L., Kailash A.B., Keertivaas and Prajan S.A.R., Histogram Matched Chest X-Rays Based Tuberculosis Detection Using CNN, Computer Systems Science and Engineering 44(1) (2023), 81–97.

18. Iqbal, Usman and Ahmed, An efficient deep learning-based framework for tuberculosis detection using chest X-ray images, Tuberculosis 136 (2022), 102234.

19. Sharma and Mishra P.K., Image enhancement techniques on deep learning approaches for automated diagnosis of COVID-19 features using CXR images, Multimedia Tools and Applications 81(29) (2022), 42649–42690.

20. Hilal A.M., Al-Wesabi F.N., Alzahrani K.J., Al Duhayyim, Ahmed Hamza, Rizwanullah, García Díaz, Deep transfer learning based fusion model for environmental remote sensing image classification model, European Journal of Remote Sensing (2022), 1–12.

21. Kashyap, Breast cancer histopathological image classification using stochastic dilated residual ghost model, International Journal of Information Retrieval Research (IJIRR) 12(1) (2022), 1–24.

22. Hazarika R.A., Kandar and Maji A.K., An experimental analysis of different deep learning based models for Alzheimer’s disease classification using brain magnetic resonance images, Journal of King Saud University-Computer and Information Sciences 34(10) (2022), 8576–8598.

23. Chen, Francis, Li, Liao, Xiao, Ha T.T., Li, Ding and Cao, Egret Swarm Optimization Algorithm: An Evolutionary Computation Approach for Model Free Optimization, Biomimetics 7(4) (2022), 144.

24. Wibowo, Weight-Dropped Long Short Term Memory Network for Stock Prediction with Integrated Historical and Textual Data, IAENG International Journal of Computer Science 47(3) (2020).

25. Tahiri M.A., Karmouni, Bencherqui, Daoui, Sayyouri, Qjidaa, Hosny K.M., New color image encryption using hybrid optimization algorithm and Krawtchouk fractional transformations, The Visual Computer (2022), 1–26.

26. https://www.kaggle.com/datasets/tawsifurrahman/tuberculosis-tb-chest-xray-dataset.

27. Tawsifur Rahman, Amith Khandakar, Muhammad Kadir, Khandaker Islam, Zaid Mahbub, Mohamed Arselene Ayari and Muhammad Chowdhury E.H., Reliable Tuberculosis Detection using Chest X-ray with Deep Learning, Segmentation and Visualization, IEEE Access 8 (2020), 191586–191601. DOI: 10.1109/ACCESS.2020.3031384.

28. Rahman, Khandakar, Kadir M.A., Islam K.R., Islam K.F., Mazhar, Hamid, Islam M.T., Kashem, Mahbub Z.B. and Ayari M.A., Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization, IEEE Access 8 (2020), 191586–191601.