Fr-WCSO-DRN: Fractional Water Cycle Swarm Optimizer-Based Deep Residual Network for Pulmonary Abnormality Detection from Respiratory Sound Signals

Respiratory sounds disclose significant information about the lungs of patients, and numerous methods have been developed for analyzing them. However, clinical approaches require qualified pulmonologists to interpret such signals appropriately and are time consuming. Hence, an efficient Fractional Water Cycle Swarm Optimizer-based Deep Residual Network (Fr-WCSO-based DRN) is developed in this research for detecting pulmonary abnormalities from respiratory sound signals. The proposed Fr-WCSO is newly designed by incorporating Fractional Calculus (FC) into the Water Cycle Swarm Optimizer (WCSO), where WCSO is the combination of the Water Cycle Algorithm (WCA) and the Competitive Swarm Optimizer (CSO). The input respiratory sound signals are pre-processed, and the features needed for further processing are extracted. Data augmentation is then applied to the extracted features to reduce overfitting and improve the overall detection performance. Once data augmentation is done, feature selection is performed using the proposed Fr-WCSO algorithm. Finally, pulmonary abnormality detection is performed using the DRN, which is trained using the developed Fr-WCSO algorithm. The developed method achieved a True Positive Rate (TPR), True Negative Rate (TNR), and testing accuracy of 0.963, 0.932, and 0.948, respectively. The testing accuracy achieved by the Random Forest classifier, machine learning, DNN, CNN, WCSO-based HAN, and the developed Fr-WCSO-based DRN is 0.753, 0.797, 0.844, 0.887, 0.929, and 0.948, respectively. The tabulated results show that the developed Fr-WCSO-based DRN attained a higher TPR of 0.963 and a higher TNR of 0.932 using dataset-1, and a higher testing accuracy of 0.948 using dataset-2. These results are obtained because the model is well trained with the proposed Fr-WCSO, which increases the learning rate of the Deep Residual Network.


Introduction
Nowadays, the world's population suffers from different kinds of respiratory disorders [1]. According to the World Health Organization (WHO), five respiratory diseases, namely asthma, tuberculosis, acute lower respiratory tract infection (LRTI), lung cancer, and chronic obstructive pulmonary disease (COPD), may lead to the death of more than 3 million people around the world [2]. In addition, people are exposed to particulate matter while working in poorly ventilated areas and come in contact with tiny substances emitted by vehicles and toxic gaseous emissions from the chimneys of industries and chemical plants. All these factors are responsible for the growth of lung-related diseases [1,3]. Respiratory sounds coming from the airways and lungs offer important information about their pathology and physiology [4]. Respiratory sounds are generally classified into two classes, namely abnormal and normal, with respect to the acoustic sound coming from the lungs [5]. Furthermore, various kinds of respiratory problems are progressive in nature and hence worsen the respiratory capacity of the human lungs. In spite of the phenomenal advancements in medical science, identifying lung-related diseases at an early stage is still a topic of concern. The abnormalities coming from the lungs are utilized for distinguishing abnormal and normal pulmonary sounds [6].
Pulmonary sounds are generally classified into adventitious sounds and normal breath sounds, and are heard on the chest wall and at the mouth. Breath sounds, which are considered normal respiratory noises, are synchronous with the flow of air through the airways, from laminar to turbulent, with a frequency spectrum of 200-600 Hz in strong and healthy lungs. Crackles, with a frequency spectrum of 100-2000 Hz and a duration of less than 70 ms, are an important element of adventitious sounds. They are frequently attributed to the bubbling of secretions in the airways or to sudden changes in gas pressure in the small airways. Crackles are generally categorized as coarse and fine. In pulmonary sounds, the timing, pitch, and number of crackles reflect the type and phase of the disease [7]. In patients with bronchiectasis and chronic airflow obstruction, low-pitched crackles called coarse crackles are produced, while the crackles of interstitial fibrosis are fine or high pitched and occur in mid to late inspiration. The application of signal processing techniques to the data obtained through the stethoscope makes the analysis of lung sounds simpler and more objective, and makes the noninvasive auscultation approach more valuable in pulmonary disease analysis. Recently, numerous studies have been conducted to acquire parametric representations of lung sounds for creating a more objective basis for their assessment [8].
With the spread of respiratory lung disorders [9], more attention is being paid to risk-free diagnosis based on the assessment of lung sounds, which contain plentiful information about the lung condition. However, the auscultation of lung sounds is of limited practical use due to the limited information it can capture, its reliance on the physician's knowledge, and its high rate of misdiagnosis. Hence, it is necessary to establish a proficient technique for identifying lung sounds and analyzing the related diseases based on the lung sounds collected by an electronic stethoscope [10]. During diagnosis, de-noising the lung sounds and extracting the vibration frequency, amplitude gradient, and sound wave amplitude help in precisely judging the characteristic vectors and evaluating the causes of disease [10]. Moreover, clinical respiratory sounds are difficult to obtain in practice, and the available samples of lung sounds are few. Thus, researchers have normally selected conventional machine learning techniques, such as hidden Markov model (HMM) [11], support vector machine (SVM) [12], artificial neural network (ANN) [13], and k-Nearest Neighbor [14], instead of a deep learning technique for categorizing the lung sounds [8].
The key objective of this research is to introduce a robust technique for pulmonary abnormality detection based on the Fr-WCSO algorithm. The respiratory sound signal is pre-processed, and features such as Bark Frequency Cepstral Coefficient (BFCC), short-term features, statistical features, and wavelet transform are extracted for pulmonary abnormality detection. Data augmentation is then performed using Window Warping (WW), jittering, and cropping. Once data augmentation is done, feature selection is performed using the proposed Fr-WCSO. The DRN classifier is employed for pulmonary abnormality detection, and the DRN is trained using the proposed optimization algorithm, named Fr-WCSO, which is the hybridization of FC and WCSO. WCSO, in turn, is the integration of WCA and CSO.
The performance improvement achieved by the developed method in comparison with the existing methods is 21.265%, 15.400%, 11.248%, 7.705%, and 2.066%.
The major contribution of the research work is as follows:
• Developed Fr-WCSO-based DRN: an effective and robust optimization algorithm, called Fr-WCSO, is developed for feature selection and for training the DRN classifier to attain better pulmonary abnormality detection results. Fr-WCSO is the incorporation of FC into WCSO, which is itself the integration of WCA and CSO.
The paper is organized as follows: Section 2 reviews the existing pulmonary abnormality detection methods, Section 3 presents the proposed pulmonary abnormality detection model, Section 4 presents the implementation outcomes, and Section 5 concludes the paper.

Motivations
This section describes existing pulmonary abnormality detection techniques based on respiratory sound signals, along with their advantages and disadvantages, which motivated the design of the proposed Fr-WCSO-based DRN method.

Literature Survey
This section reviews eight existing pulmonary abnormality detection methods along with their advantages and disadvantages.
Alfonso Monaco et al. [7] introduced a multi-time-scale machine learning model to categorize respiratory sounds such as crackles and wheezes. This model comprises three modules: data standardization, multi-time-scale feature extraction, and classification. This technique was utilized for discriminating healthy controls from patients with respiratory disorders. This model was appropriate for large-scale applications, but it failed to identify the important sounds at the respiratory cycle level. Fei Meng et al. [8] developed a machine learning method to recognize respiratory sounds using wavelet coefficients. This method integrated the idea of wavelet signal similarity with relative wavelet energy and wavelet entropy. This approach attained improved classification accuracy; on the other hand, the normalization module may lead to an increase in errors. Jyotibdha Acharyay and Arindam Basu [15] presented a deep Convolutional Neural Network-Recurrent Neural Network (CNN-RNN) model to classify respiratory sounds using Mel spectrograms. Here, a local log quantization of the trained weights is performed to minimize the memory requirements. This model obtained substantial weight compression without any adjustments to the architecture. However, this technique failed to reduce the operational cost. Samiul Based Shuvo et al. [2] introduced a lightweight CNN model for categorizing respiratory diseases with hybrid scalogram features. This model achieved improved accuracy in categorizing ternary chronic diseases, but the major challenge lies in utilizing this approach in real-world clinical applications.
Fraiwan et al. [16] introduced a deep learning framework using CNN and bidirectional long short-term memory (LSTM) for classifying respiratory sounds. This deep learning model achieved improved performance in categorizing lung sounds. However, the training structure and the pre-processing techniques were not tuned to achieve effective results. Neeraj Baghel et al. [17] designed an automatic classification approach based on machine learning for diagnosing multiple pulmonary diseases from lung sounds. This method effectively minimized the computational time in treating pulmonary diseases and was appropriate for real-time applications. However, it failed to reduce the computational overhead. Khan et al. [18] modeled an empirical mode decomposition (EMD) method for extracting the intrinsic mode functions (IMFs) of lung sounds because of their non-linear and non-stationary nature. However, this method failed to consider novel deep learning classifiers for improving the classification performance. Jayalakshmy and Gnanou Florence Sudha [19] presented a pre-trained optimized AlexNet CNN model to predict respiratory disorders. This model achieved enhanced performance in classifying respiratory disorders, thereby preventing ambiguous analysis. However, this method failed to consider larger and heterogeneous datasets for effective classification results.

Challenges
The issues faced by existing abnormality classification techniques based on respiratory sound signals are as follows:
• The major constraint of the multi-scale machine learning model introduced in [7] is its inability to detect the major sounds at the respiratory cycle phase. In addition, deep learning methods, namely Long Short-Term Memory (LSTM) and ResNet, were not considered for enhancing the classification performance.
• The machine learning method developed in [8] attained superior performance, but it failed to combine the respiratory sounds with other medical parameters, like spirometry parameters, to provide an intelligent disease recognition model.
• In [16], a deep learning method was designed for categorizing respiratory sounds. However, this work can be further extended by enlarging the dataset to include more subjects and a wider collection of diseases.
• An automatic classification approach was introduced for diagnosing respiratory problems, but this method failed to recognize other types of lung sounds associated with various respiratory disorders [17].
• In [19], an adaptive kernel selection algorithm was designed to classify human lung sounds. This algorithm effectively classified normal and abnormal lung sounds. However, it does not classify other biomedical sound signals, namely phonocardiogram and bowel sounds, into abnormal and normal categories.

Proposed Fr-WCSO-Based DRN for Pulmonary Abnormality Detection
This section illustrates the process of designing and developing an approach for detecting pulmonary abnormalities from the respiratory sound signal. The proposed technique combines advantages such as a fast convergence rate, avoidance of trapping at local minima, and a balance between the exploration and exploitation phases to obtain enhanced performance. The phases involved in detecting pulmonary abnormalities are the pre-processing phase, feature extraction phase, data augmentation phase, feature selection phase, and detection phase. Initially, the input respiratory sound signal is fed to the pre-processing phase, which uses a Hanning window [20] and a spectral gating-based noise reduction method [21] to remove the unwanted distortions and external artifacts present in the input signal. Once the pre-processing is done, feature extraction is carried out to extract features such as BFCC [22] and short-term features like spectral flux, spectral centroid, and Power Spectral Density (PSD) [23]. In addition, statistical features, like mean, standard deviation, entropy, energy, and kurtosis, and the wavelet transform [24] are extracted from the input signal. Once the appropriate features are extracted, data augmentation is done to enlarge the dataset. After that, feature selection is performed using the proposed Fr-WCSO, which is newly designed by integrating FC [25] and the WCSO algorithm. WCSO, in turn, is the combination of WCA [26] and CSO [27]. Finally, pulmonary abnormality detection is performed using the DRN [28], which is trained using the proposed Fr-WCSO to achieve effective detection outcomes. Figure 1 depicts the schematic view of the developed Fr-WCSO-based DRN for pulmonary abnormality detection.

Input Acquisition
At first, the input respiratory sound signal is acquired from a dataset $S$ with $n$ signals, which is given as,
$$S = \{H_1, H_2, \ldots, H_g, \ldots, H_n\}$$
where $n$ represents the total number of respiratory sound signals, and $H_g$ denotes the $g$th input signal, which is fed as an input to the pre-processing phase.

Pre-processing
Once the input respiratory sound signals are acquired, pre-processing is performed using the Hanning window [20] and the spectral gating-based noise reduction technique [21] to remove distortions from the signals. Pre-processing makes the subsequent phases more effective. The Hanning window and the spectral gating-based noise reduction technique are described below.

Hanning Window
The Hanning window [20], or Hann window, is applied to reduce noise and distortion in the input respiratory sound signals. This technique is employed as a pre-analysis module for improving the results. The continuous stream of the sound signal is divided into successive blocks of samples known as frames. The frames extracted from the respiratory sound signals are windowed using the Hanning window technique to suppress unnecessary signal components and retain the valuable information. This approach enables smoothing without losing the content present at sharp points or edges. The Hann window function is denoted as $H(Z)$.
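The framing and windowing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `frame_and_window`, the frame length of 256 samples, and the hop of 128 samples are assumptions, and the window uses the standard symmetric Hann definition because the source equation for $H(Z)$ is not available.

```python
import numpy as np

def frame_and_window(signal, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping frames and apply a Hann window.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (signal.size - frame_len) // hop
    # Standard symmetric Hann window: 0.5 * (1 - cos(2*pi*k / (frame_len - 1)))
    k = np.arange(frame_len)
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * k / (frame_len - 1)))
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n_frames)])
    return frames * window, window

# Example: a constant signal shows the window shape directly in every frame.
frames, window = frame_and_window(np.ones(1024))
```

The window tapers each frame to zero at both ends, which limits spectral leakage when the frames are later passed to an FFT.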

Spectral Gating-Based Noise Reduction Method
The spectral gating-based noise reduction method [21] is an algorithm for removing unnecessary noise from the signals. The steps of the spectral gating-based noise reduction algorithm are as follows:
Step 1. A Fast Fourier Transform (FFT) is computed over the noise audio clip.
Step 2. Statistics are computed over the FFT of the noise, per frequency.
Step 3. A threshold is computed from the noise statistics and the chosen sensitivity of the algorithm.
Step 4. An FFT is computed over the signal.
Step 5. A mask is determined by comparing the FFT of the signal to the threshold.
Step 6. The mask is smoothed with a filter over time and frequency.
Step 7. The mask is applied to the FFT of the signal, and finally the result is inverted back to the time domain.
Accordingly, the pre-processed output generated using the Hanning window and the spectral gating-based noise reduction technique is denoted as $P^{*}_{p}$, which is given to the feature extraction phase.
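The seven steps above can be sketched with NumPy alone. This is a hedged illustration under stated assumptions, not the method of [21]: the helper names (`stft`, `istft`, `spectral_gate`), the FFT size, the hop, the sensitivity `n_std`, the decibel scale for the statistics, and the 3×3 mean filter used for smoothing are all choices made for this example.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Hann-windowed short-time FFT; returns an array of shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256, hop=64):
    """Weighted overlap-add inverse of stft()."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def _to_db(spec):
    return 20.0 * np.log10(np.abs(spec) + 1e-10)

def spectral_gate(signal, noise_clip, n_std=1.5, n_fft=256, hop=64):
    # Steps 1-2: FFT of the noise clip, then per-frequency mean and std (in dB).
    noise_db = _to_db(stft(np.asarray(noise_clip, dtype=float), n_fft, hop))
    # Step 3: threshold from the noise statistics and the sensitivity n_std.
    threshold = noise_db.mean(axis=0) + n_std * noise_db.std(axis=0)
    # Step 4: FFT of the signal (zero-padded so every sample is fully overlapped).
    padded = np.pad(np.asarray(signal, dtype=float), n_fft)
    spec = stft(padded, n_fft, hop)
    # Step 5: the mask is 1 where the signal exceeds the threshold, else 0.
    mask = (_to_db(spec) > threshold).astype(float)
    # Step 6: smooth the mask over time and frequency with a 3x3 mean filter.
    edge = np.pad(mask, 1, mode="edge")
    rows, cols = mask.shape
    mask = sum(edge[i:i + rows, j:j + cols] for i in range(3) for j in range(3)) / 9.0
    # Step 7: apply the mask and invert back to the time domain.
    cleaned = istft(spec * mask, n_fft, hop)
    return cleaned[n_fft:n_fft + len(signal)]
```

A larger `n_std` makes the gate less sensitive, so fewer time-frequency bins are kept; in practice the noise clip would be a segment of the recording that contains no breath sound.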

Feature Extraction
Once the pre-processing is done, feature extraction is carried out to extract the significant features, as this step is essential for detection.


BFCC
The BFCC [22] representation is used for extracting significant features from the input respiratory sound signal. BFCC warps the power spectrum to match human perception of sound intensity. The frequency bands of BFCC are linear up to 500 Hz. In the Bark conversion, Bark denotes the Bark frequency and $c$ specifies the frequency in Hertz. The BFCC feature output is represented as $f_1$ with the dimension [920 × 200].

Short-Term Features
The short-term features, namely spectral flux, spectral centroid, and PSD, are described below.
(a) Spectral flux
Spectral flux is a feature that measures the rate of change of spectral information in the respiratory sound signal. It is computed from frequency-domain parameters as the difference between successive spectral frames. In the expression for spectral flux, the term $J_r(U)$ indicates the spectrum value of the respiratory sound signal, and $t_1$ denotes the spectral flux feature with the dimension [920 × 1].
(b) Spectral centroid
The spectral centroid defines the center of gravity of the spectrum; higher centroid values correspond to higher signal frequencies. This feature provides information about the signal variations. In its expression, $B_l(s)$ denotes the short-time Fourier transform, $T(s)$ indicates the frequency of the $s$th frame, the frame length enters as a further term, and the spectral centroid feature is signified as $t_2$ with the dimension [920 × 1].
(c) PSD
PSD [23] is a feature used for stationary signal processing and is appropriate for narrowband signals. It is a general signal processing measure that describes the distribution of signal power over frequency, showing the strength of the energy as a function of frequency. In its expression, $Nyq$ indicates the Nyquist frequency, ℵ denotes the frequency in hertz, and the PSD feature is specified as $t_3$ with the dimension [920 × 12].
The extracted features, namely spectral flux, spectral centroid, and PSD, are combined to form the short-term feature $f_2$.
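The three short-term features can be sketched from the magnitude spectrum of each frame. The formulas below are common textbook definitions and may differ from the paper's expressions, which are not available in the source. The function name `short_term_features`, the normalisation of the flux, the periodogram scaling, and the averaging of the PSD into 12 equal-width bands (chosen only to match the stated [920 × 12] dimension) are assumptions. The sketch also returns one row per frame; how frames are aggregated into one row per recording is not stated in the source.

```python
import numpy as np

def short_term_features(frames, fs, n_psd_bands=12):
    """Spectral flux, spectral centroid, and banded PSD for each frame."""
    frames = np.asarray(frames, dtype=float)
    n_frames, frame_len = frames.shape
    spectrum = np.abs(np.fft.rfft(frames, axis=1))     # magnitude spectrum
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)      # bin frequencies in Hz

    # Spectral flux: squared change between successive normalised spectra.
    unit = spectrum / (spectrum.sum(axis=1, keepdims=True) + 1e-12)
    flux = np.zeros(n_frames)
    flux[1:] = np.sum(np.diff(unit, axis=0) ** 2, axis=1)

    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = (spectrum * freqs).sum(axis=1) / (spectrum.sum(axis=1) + 1e-12)

    # PSD: periodogram |X|^2 / (fs * N), averaged into equal-width bands.
    psd = spectrum ** 2 / (fs * frame_len)
    bands = np.array_split(psd, n_psd_bands, axis=1)
    psd_bands = np.stack([band.mean(axis=1) for band in bands], axis=1)

    # Columns: flux (1), centroid (1), PSD bands (12) -> 14 in total.
    return np.column_stack([flux, centroid, psd_bands])
```

For a stationary pure tone the flux is zero and the centroid equals the tone frequency, which gives a quick sanity check of the implementation.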

Statistical Features
The statistical features extracted in the pulmonary abnormality detection process are mean, standard deviation, entropy, energy, and kurtosis, which are explained below.
(a) Mean
The mean is the average of all the vector values. In its expression, $W_r$ signifies the vector values of every class, $d$ denotes the total number of vector values of the pre-processed signal, and the mean feature is denoted as $s_1$ with the dimension [920 × 1].
(b) Standard deviation
Standard deviation is a feature that quantifies the total variation in the vector values. In its expression, $X_i$ signifies each value of the data, $\bar{X}$ indicates the mean of $X_i$, and $n$ indicates the total count of data points.
(c) Entropy
Entropy is a measure of the amount of information carried by the signal vectors. The information at corners and edges is measured using the entropy feature $s_3$, which has the dimension [920 × 1]. In its expression, $p_i$ refers to the probability (obtained from the normalized histogram of the image) associated with gray level $i$, and the range 0-255 is the pixel intensity value.
(d) Energy
The energy feature of a pre-processed signal is computed by summing the vector values present in the signal. The extracted feature is represented as $s_4$ with the dimension [920 × 1], where $x(t)$ is real-valued.
(e) Kurtosis
Kurtosis defines the sharpness of the peak of a frequency distribution. In its expression, $Y_i$ is the $i$th variable of the distribution, $\bar{Y}$ is the mean of the distribution, and $n$ indicates the number of variables in the distribution.
The statistical features, namely mean, standard deviation, entropy, energy, and kurtosis, are combined to form the feature $f_3$.
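The five statistical features can be sketched with standard definitions. These are assumptions, since the paper's own expressions are not available in the source: the standard deviation uses the population form (division by n), the entropy is the Shannon entropy in bits of a normalised histogram, the energy is the sum of squared samples, and the kurtosis is the fourth standardised moment. The function name `statistical_features` and the default of 256 histogram bins are illustrative.

```python
import numpy as np

def statistical_features(x, n_bins=256):
    """Return [mean, std, entropy, energy, kurtosis] for one signal vector."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()                                   # population form (divides by n)
    counts, _ = np.histogram(x, bins=n_bins)
    p = counts[counts > 0] / counts.sum()           # normalised histogram
    entropy = -np.sum(p * np.log2(p))               # Shannon entropy in bits
    energy = np.sum(x ** 2)                         # sum of squared samples
    kurtosis = np.mean((x - mean) ** 4) / (std ** 4 + 1e-12)
    return np.array([mean, std, entropy, energy, kurtosis])

# Small worked example with four values and four histogram bins.
s = statistical_features([1.0, 2.0, 3.0, 4.0], n_bins=4)
```

For the vector [1, 2, 3, 4] the mean is 2.5, the energy is 1 + 4 + 9 + 16 = 30, and the four equally filled bins give an entropy of 2 bits.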

Wavelet Transform Feature
A wavelet [24] is a wave-like oscillation with an amplitude that increases, decreases, and returns to zero. Wavelets have valuable and practical properties for signal processing. They are also useful in wavelet-based compression and decompression algorithms designed to restore the original information with minimal loss. In addition, the wavelet transform can be considered a time-frequency representation of signals, used to model systems, signals, and processes through a combination of wavelet functions. The wavelet transform represents an arbitrary waveform as scaled and shifted copies of one small waveform, which is used as a pattern to be stretched, translated, and compressed. Thus, the wavelet feature is indicated as $f_4$ with the dimension [920 × 200].
Once all the significant features are extracted, they are combined to generate the final feature vector, represented as,
$$F^{*}_{e} = \{f_1, f_2, f_3, f_4\} \tag{10}$$
where $f_1$ signifies the BFCC feature, $f_2$ represents the short-term features, $f_3$ specifies the statistical features, $f_4$ denotes the wavelet transform feature, and $F^{*}_{e}$ indicates the feature extraction output with the dimension [920 × 419], which is then presented as an input to the data augmentation phase.
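The stated output dimension can be checked by concatenating placeholder arrays of the reported sizes. The sketch assumes that standard deviation and kurtosis each contribute one column, which the text does not state explicitly but which is consistent with the total of 419 columns: 200 + (1 + 1 + 12) + 5 + 200 = 419.

```python
import numpy as np

n_signals = 920
f1 = np.zeros((n_signals, 200))   # BFCC
f2 = np.zeros((n_signals, 14))    # spectral flux (1) + spectral centroid (1) + PSD (12)
f3 = np.zeros((n_signals, 5))     # mean, std, entropy, energy, kurtosis (1 column each, assumed)
f4 = np.zeros((n_signals, 200))   # wavelet transform

# Column-wise concatenation gives the feature extraction output.
F_e = np.hstack([f1, f2, f3, f4])
```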

Data Augmentation
Data augmentation is a process that enlarges the dataset by generating previously unobserved samples. Its major aim is to improve the pulmonary abnormality detection performance by enlarging the dataset, since a small number of training samples may lead to overfitting. To avoid overfitting during training, the data augmentation methods WW, jittering [29], and cropping are used, as described below.
(a) Window warping (WW)
WW [29] is a time-series-specific approach that aims at increasing the quantity and diversity of the data. It selects a slice of the time series at random using a sliding window and warps it by dilating or squeezing. The choice of the warping area is significant for the performance of WW, because the time scale has important physiological meaning. Hence, WW is a double-edged method. With accurate parameter selection, it can add non-redundant samples to the dataset. However, if the warping area or warping ratio is not carefully chosen, the generated samples may destroy the physiological meaning and degrade the detection performance.
(b) Jittering
Jittering [29] is a scheme that adds random jitter to the signal recordings to imitate noise. Jittering thus adds noise back to recordings from which baseline wander or noise was previously removed.
(c) Cropping
Cropping selects a specific region of the signals.
As a result, the data augmentation output generated using WW, jittering, and cropping is signified as $A^{*}_{p}$ with the dimension [3680 × 419], which is fed as an input to the feature selection phase.
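A minimal sketch of the three augmentation operations is given below. It assumes that each method produces one new copy of every sample and that the copies are stacked with the originals, which would explain the growth from 920 to 3680 rows (920 × 4). The function names, the warping ratio, the noise level, the crop fraction, and the resampling of warped or cropped vectors back to the original length are all assumptions made for this example.

```python
import numpy as np

def _resample(x, n):
    """Linearly resample a 1-D array to n points."""
    return np.interp(np.linspace(0, len(x) - 1, n), np.arange(len(x)), x)

def window_warp(x, rng, width_frac=0.25, ratio=2.0):
    """Dilate (ratio > 1) or squeeze (ratio < 1) a random slice, then restore the length."""
    width = max(2, int(len(x) * width_frac))
    start = rng.integers(0, len(x) - width + 1)
    warped = _resample(x[start:start + width], max(2, int(round(width * ratio))))
    stretched = np.concatenate([x[:start], warped, x[start + width:]])
    return _resample(stretched, len(x))

def jitter(x, rng, sigma=0.01):
    """Add small Gaussian noise to every element."""
    return x + rng.normal(0.0, sigma, size=len(x))

def crop(x, rng, keep_frac=0.9):
    """Keep a random contiguous region and resample it to the original length."""
    width = max(2, int(len(x) * keep_frac))
    start = rng.integers(0, len(x) - width + 1)
    return _resample(x[start:start + width], len(x))

def augment(samples, seed=0):
    """Stack the original samples with one WW, one jittered, and one cropped copy."""
    rng = np.random.default_rng(seed)
    views = [samples]
    for fn in (window_warp, jitter, crop):
        views.append(np.stack([fn(row, rng) for row in samples]))
    return np.vstack(views)
```

Keeping the originals as the first block makes it easy to hold them out when the augmented copies should be used for training only.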

Feature Selection
Reducing the number of features is important for addressing computational complexity. The feature selection process is performed using the proposed Fr-WCSO algorithm. Here, the developed Fr-WCSO algorithm trains the classifier weights to achieve an effective optimal solution. The hybridization of WCSO with FC improves the efficiency of the developed scheme by reducing the computational complexity. The solution encoding and the computation of the fitness measure are described below.
Solution encoding
The solution encoding is a representation of the solution vector that selects the optimal features. The dimension of the solution is [1 × w], where $w$ denotes the total number of features to be selected. Figure 2 portrays the solution encoding.
Fitness measure
The fitness measure is used for identifying the optimal solution by computing the optimal fitness value. In the fitness equation (11), RV and CS signify the RV coefficient and the cosine similarity between the $i$th feature and the class label.
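The two quantities named in the fitness measure can be sketched for a single feature column and the class label. How the paper combines RV and CS into one fitness value is not recoverable from the source, so the helper `score_subset` below only reports their means over the selected features; it is a hypothetical helper, not the paper's equation (11). For two single-column data sets the RV coefficient reduces to the squared Pearson correlation, which is what `rv_coefficient` computes.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rv_coefficient(a, b):
    """RV coefficient of two single-column data sets (squared Pearson correlation)."""
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return float((a @ b) ** 2 / ((a @ a) * (b @ b) + 1e-12))

def score_subset(features, labels, selected):
    """Mean RV and mean cosine similarity between the selected columns and the labels."""
    rv = np.mean([rv_coefficient(features[:, i], labels) for i in selected])
    cs = np.mean([cosine_similarity(features[:, i], labels) for i in selected])
    return float(rv), float(cs)
```

A feature identical to the label scores 1 on both measures, while a feature orthogonal to and uncorrelated with the label scores 0 on both.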
The algorithmic phases of the developed Fr-WCSO algorithm are described below.
(a) Initialization
The population is initialized with $N$ raindrops, where $N$ signifies the total number of raindrop solutions, and $Y_k$ denotes the $k$th raindrop.
(b) Calculate fitness measure
The fitness measure is computed using the equation expressed in (11).
(c) Compute the cost function
The decision variables $Y_1, Y_2, \ldots, Y_{N_{var}}$ can be floating-point values for continuous problems or can be taken from a predefined set for discrete problems. The cost of a raindrop is computed using the cost function, where the total number of raindrops is signified as $N_{pop}$ and $N_{var}$ represents the total number of design variables. The number of rivers together with the sea is $N_{sr} = \text{Total number of rivers} + 1$, where the added 1 is the sea, and $N_q$ denotes the number of streams flowing to a particular river or to the sea.
(e) Evaluate the flow of stream to the river or sea
Streams are formed from the raindrops and join each other to produce new rivers. The streams and rivers ultimately flow to the sea. A stream flows to its river along the line connecting them by a randomly selected distance, where the value of $A$ lies in the range 1 to 2 and the current distance between the river and the stream is signified as $k$. A value of $A$ greater than one enables the streams to flow in various directions toward the rivers. The same rule is used for rivers flowing to the sea. In the updated positions of rivers and streams, given by (14), rand denotes a uniformly distributed random number between 0 and 1, $Q_1, Q_2, Q_3 \in [0, 1]^n$, $X_z$ denotes the velocity of the loser, $Y_z^x$ implies the location of the winner, the value of $N$ lies in the range 1 to 2, a control parameter governs the influence of $Q(u)$, the mean position of all particles in $(u)$ is signified as $Q(u)$, and $\hbar$ lies in the range 1 to 2.
(f) Determine the condition for evaporation
Evaporation is the key factor that protects the process from premature convergence. Here, the evaporated water rises into the atmosphere to form clouds, which condense in the colder atmosphere and release the water back to the earth as rain. The raindrops create new streams that flow to rivers, which then flow into the sea. Moreover, $h_{max}$ represents a small number close to 0, and the value of $h_{max}$ decreases adaptively.
(g) Raining procedure
After the evaporation process is complete, the raining process is performed. Here, the new raindrops form streams in various places, and the positions of the newly formed streams are bounded by $lb$ and $ub$, where $lb$ denotes the lower bound and $ub$ the upper bound. In the update near the sea, a coefficient set to 0.1 defines the limit of the search area close to the sea, and $rand_k$ implies a normally distributed random number.
(h) Feasibility assessment
The feasibility of the new solution is evaluated to find the optimal value. If the newly obtained solution has a better fitness value, it replaces the existing one.
(i) Termination
All the above-mentioned phases are repeated until the best solution is achieved. Table 1 presents the pseudo code of the proposed Fr-WCSO algorithm.
As a result, the feature selection output is denoted as $F^{*}_{p}$ with the dimension [3680 × 100], and is fed as an input to the pulmonary abnormality detection phase.
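The two building blocks that WCSO combines can be sketched from their standard published forms. This is not the Fr-WCSO update itself: the fractional-calculus term and the exact way the paper merges the two rules are not recoverable from the source. The function names and the parameter `phi` (the weight of the swarm-mean term) are assumptions made for this example.

```python
import numpy as np

def wca_stream_update(stream, river, A, rng):
    """Standard WCA move: a stream steps toward its river by rand * A * distance.

    With A between 1 and 2 the stream may overshoot the river, which lets it
    approach from different directions.
    """
    return stream + rng.random(stream.shape) * A * (river - stream)

def cso_loser_update(x_loser, v_loser, x_winner, x_mean, phi, rng):
    """Standard CSO move: the loser of a pairwise competition learns from the
    winner and from the mean position of the swarm."""
    q1, q2, q3 = rng.random((3,) + x_loser.shape)
    v_new = q1 * v_loser + q2 * (x_winner - x_loser) + phi * q3 * (x_mean - x_loser)
    return x_loser + v_new, v_new

# Example: one stream moving toward a river in five dimensions.
rng = np.random.default_rng(0)
new_stream = wca_stream_update(np.zeros(5), np.ones(5), A=2.0, rng=rng)
```

Each coordinate of the moved stream lies between the old stream position and the point at A times the distance along the line to the river, so with A = 2 the values above fall in the interval [0, 2).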

Proposed Fr-WCSO-Based DRN for Pulmonary Abnormality Detection
The feature selection output $F^{*}_{p}$ is given as input to the final phase, pulmonary abnormality detection. Here, the DRN classifier [28] is utilized for detecting pulmonary abnormalities, as this classifier achieves better generalization performance and accuracy. The DRN classifier is trained using the developed Fr-WCSO algorithm. The architecture and the training process of the DRN are described below.

Architecture of DRN
The DRN model [28] comprises of various layers, such as input layer, convolutional (Conv) layer, batch normalization, activation function, average pooling layer, flatten layer, and dense layer. Here, each layer considers the output of the previous layer to execute the further process.
Conv layer During the training process, the two-dimensional conv layer minimizes the free parameters and the input is processed using a sequence of filters known as kernel with local connections. The mathematical operation of conv layer is a dot product of input and the kernel for sliding every filter on the input matrix. However, the computation process of conv layer is expressed as, where, d signifies the 2-D convolutional output from the preceding layer, i and j records the coordinates, L implies u × u kernel matrix, and w and h indicates kernel matrix index. Hence, L b signifies the size of the kernel for bth input neuron, and * symbolizes the cross-correlation operator.
Average pooling layer The pooling layer is generally attached into the conv layers, and is operated on every slice and depth of feature maps for minimizing the spatial dimensions of feature maps to control the over fitting problems.
where, the height and width of input matrix is signified as h in and w in , whereas h out and w out are the output. In addition, the height and width of the kernel size is represented as b h and b w .
Activation function To enhance the non-linearity of the extracted features, a non-linear activation function known as the Rectified Linear Unit (ReLU) is utilized, which enables the network to learn nonlinear and complicated features. The ReLU function is expressed as,

$$\mathrm{ReLU}(d) = \begin{cases} 0, & d < 0 \\ d, & d \ge 0 \end{cases}$$

where $d$ represents the input feature.

Residual blocks Residual blocks possess a shortcut connection from input to output, so that the input is added directly to the output when the two are of equal size. Otherwise, a dimension matching factor is applied to match the dimensions of the input and output,

$$x = \Re(d) + d$$

$$x = \Re(d) + W_d \, d$$

where $d$ signifies the input of the residual block, $x$ denotes the output of the residual block, $\Re$ indicates the mapping relationship between the input and the output, and $W_d$ specifies the dimension matching factor.
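The shortcut connection can be sketched as follows, with the input added to the mapped output either directly or after multiplication by a dimension matching matrix. This is a simplified vector-valued illustration, not the network used in this work: the helpers `relu`, `matvec`, and `residual_block` and all weights are hypothetical.

```python
def relu(v):
    # Element-wise ReLU: negative entries are clipped to zero.
    return [max(0.0, t) for t in v]


def matvec(W, v):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(w * t for w, t in zip(row, v)) for row in W]


def residual_block(d, mapping, W_d=None):
    """Return mapping(d) + d, or mapping(d) + W_d d when sizes differ."""
    shortcut = d if W_d is None else matvec(W_d, d)
    mapped = mapping(d)
    return [a + b for a, b in zip(mapped, shortcut)]


d = [1.0, -2.0]

# Case 1: input and output have equal size, identity shortcut.
identity = [[1, 0], [0, 1]]
x_same = residual_block(d, lambda v: relu(matvec(identity, v)))

# Case 2: the mapping changes the size from 2 to 3, so a
# dimension matching matrix W_d projects the shortcut.
W_map = [[1, 0], [0, 1], [1, 1]]
W_d = [[1, 0], [0, 1], [0, 0]]
x_proj = residual_block(d, lambda v: relu(matvec(W_map, v)), W_d)

print(x_same)  # [2.0, -2.0]
print(x_proj)  # [2.0, -2.0, 0.0]
```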
Linear classifier The linear classifier is constructed by integrating a fully connected (FC) layer with the SoftMax function. The FC layer connects every neuron in one layer to every neuron in the next, whereas the SoftMax function normalizes the input vector into a probability vector, so that the class with the highest probability is selected as the final output. The FC layer computes $W_{O \times P} \, d_{P \times Q} + e$, and the SoftMax function maps each element $d_l$ of the output layer to $\exp(d_l) / \sum_{m=1}^{O} \exp(d_m)$,
where $W_{O \times P}$ represents the weight matrix with $O \times P$ dimension, $e$ denotes the bias, $d_{P \times Q}$ signifies the input feature map with $P \times Q$ dimension, $d_l$ specifies the element of the output layer, and $O$ indicates the output dimension. Figure 3 presents the architectural representation of DRN.
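A minimal sketch of the linear classifier, assuming a single flattened feature vector as input, is given below. The names `dense`, `softmax`, and `predict`, together with the weights and the two-class setting, are illustrative assumptions. The maximum is subtracted before exponentiation only for numerical stability and does not change the resulting probabilities.

```python
import math


def dense(W, d, e):
    # Fully connected layer: one weighted sum plus bias per output neuron.
    return [sum(w * t for w, t in zip(row, d)) + b for row, b in zip(W, e)]


def softmax(z):
    # Subtracting max(z) avoids overflow without changing the result.
    m = max(z)
    exps = [math.exp(t - m) for t in z]
    total = sum(exps)
    return [v / total for v in exps]


def predict(W, d, e):
    """Return the index of the most probable class and all probabilities."""
    probs = softmax(dense(W, d, e))
    return probs.index(max(probs)), probs


# Hypothetical two-class example (for instance, normal versus abnormal).
W = [[1.0, 0.0],
     [0.0, 1.0]]
e = [0.0, 0.0]
d = [2.0, 1.0]
label, probs = predict(W, d, e)
print(label, [round(p, 3) for p in probs])  # 0 [0.731, 0.269]
```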

Training Process of DRN Using Proposed Fr-WCSO
The DRN model is trained using the proposed optimization algorithm, named Fr-WCSO. The developed Fr-WCSO is designed by combining FC [25] with WCSO, which is itself a hybridization of WCA [26] and CSO [27]. The fitness employed for training the DRN classifier is evaluated over all training samples from the output of the DRN classifier, denoted as $D_p$, and the target output, $O$. The algorithmic phases of the developed Fr-WCSO algorithm are described in Section 3.5. The proposed Fr-WCSO-based DRN thus classifies respiratory sound signals as normal or abnormal, and this decision can then be passed on to downstream applications.
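For illustration, a fitness measure of this kind is often taken to be the mean squared error between the classifier outputs and the targets over all training samples. The sketch below assumes that form; it is not confirmed by the source, and the function name `mse_fitness` and the sample values are hypothetical.

```python
def mse_fitness(outputs, targets):
    """Mean squared error between classifier outputs and target outputs.

    Assumed form of the fitness; the source does not give the equation.
    """
    n = len(outputs)
    return sum((D_p - O_p) ** 2 for D_p, O_p in zip(outputs, targets)) / n


# Hypothetical outputs for four training samples (1 = abnormal, 0 = normal).
outputs = [0.9, 0.2, 0.8, 0.4]
targets = [1, 0, 1, 0]
fitness = mse_fitness(outputs, targets)
print(round(fitness, 4))  # 0.0625
```

Under this assumption, an optimizer such as Fr-WCSO would keep the candidate weight set that gives the smallest error.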

Results and Discussion
The implementation results of the proposed Fr-WCSO-based DRN, in terms of the evaluation metrics TPR, TNR, and testing accuracy, are presented in this section.

Experimental Set-Up
The developed Fr-WCSO-based DRN is implemented in Python using the International Conference on Biomedical and Health Informatics (ICBHI 2017) dataset (dataset-1) [30] and the respiratory sound database (dataset-2) [31], on a PC with an Intel Core i3 processor, 2 GB RAM, and Windows 10 OS.

Dataset Description
This section explains the two datasets utilized for pulmonary abnormality detection.
Dataset 1 Dataset-1 [30] includes a total recording time of 5.5 h with 6898 respiratory phases, of which 886 contain wheezes, 1864 contain crackles, and 506 contain both crackles and wheezes, in 920 annotated audio samples from 126 subjects. The cycles are annotated by respiratory experts as containing wheezes, crackles, a mix of wheezes and crackles, or no anomalous respiratory sounds. Heterogeneous equipment is utilized for gathering the signals, whose durations range from 10 to 90 s. The chest location from which each recording is acquired is also provided. Furthermore, each file name consists of five elements, namely recording index, chest location, patient number, recording equipment, and acquisition mode, separated by underscores (_).
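The underscore-separated naming scheme can be parsed with a few lines of Python. The field order used below (patient number, recording index, chest location, acquisition mode, recording equipment) follows the convention of the public ICBHI 2017 release as commonly documented and should be treated as an assumption, as should the example file name and the helper names `RecordingInfo` and `parse_recording_name`.

```python
from typing import NamedTuple


class RecordingInfo(NamedTuple):
    # Field order is an assumption based on the public ICBHI 2017 naming.
    patient_number: str
    recording_index: str
    chest_location: str
    acquisition_mode: str
    recording_equipment: str


def parse_recording_name(file_name):
    """Split a recording file name into its five underscore-separated parts."""
    stem = file_name.rsplit(".", 1)[0]  # drop the extension, if any
    parts = stem.split("_")
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {file_name!r}")
    return RecordingInfo(*parts)


# Illustrative file name in the assumed format.
info = parse_recording_name("101_1b1_Al_sc_Meditron.wav")
print(info.patient_number, info.chest_location, info.recording_equipment)
# 101 Al Meditron
```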
Dataset 2 Dataset-2 [31] was introduced by two research teams from Greece and Portugal. It comprises 920 annotated recordings with durations from 10 to 90 s, taken from 126 patients. The total recording time is about 5.5 h with 6898 respiratory phases, of which 1864 contain crackles, 886 contain wheezes, and 506 contain both wheezes and crackles. The data include noisy recordings and clean respiratory sounds that emulate real-life conditions. The patients cover different age groups, including children, adults, and the elderly.
The collection of signals was obtained from the publicly available 2017 Int. Conf. on Biomedical Health Informatics. In addition, another team from the Aristotle University of Thessaloniki (AUTH) and the University of Coimbra (UC) acquired respiratory sounds at the Papanikolaou General Hospital, Thessaloniki, and the General Hospital of Imathia (Health Unit of Naousa), Greece. The dataset covers 126 subjects with various types of respiratory diseases, including pneumonia, BRON, and COPD. Lung sounds were recorded from each patient for durations varying between 10 and 90 s, and all signals were re-sampled at a sampling frequency of 4 kHz.

Evaluation Metrics
The performance of the proposed Fr-WCSO-based DRN is analyzed based on the evaluation measures TPR, TNR, and testing accuracy.
(a) TPR TPR measures the proportion of actual abnormalities that are correctly detected and is expressed as,

$$\mathrm{TPR} = \frac{a}{a + q}$$

(b) TNR TNR measures the proportion of normal samples that are correctly identified and is formulated as,

$$\mathrm{TNR} = \frac{b}{b + p}$$

(c) Testing accuracy Testing accuracy measures the overall correctness of the detected abnormalities and is expressed as,

$$\mathrm{Accuracy} = \frac{a + b}{a + b + p + q}$$

where $a$ denotes the true positives, $b$ represents the true negatives, and $p$ and $q$ imply the false positives and false negatives, respectively.

Figure 4 presents the implementation outcomes of the developed Fr-WCSO-based DRN method. The input sound wave is depicted in Fig. 4a, the pre-processed sound wave is shown in Fig. 4b, and the spectral centroid and spectral flux feature outputs are presented in Fig. 4c and d, respectively.
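The three evaluation measures follow directly from the confusion-matrix counts under their usual definitions, with a, b, p, and q denoting true positives, true negatives, false positives, and false negatives. The counts in the example are hypothetical and are not taken from the experiments reported here.

```python
def evaluation_metrics(a, b, p, q):
    """Return (TPR, TNR, accuracy) from confusion-matrix counts.

    a: true positives, b: true negatives,
    p: false positives, q: false negatives.
    """
    tpr = a / (a + q)                     # share of abnormal cases detected
    tnr = b / (b + p)                     # share of normal cases recognized
    accuracy = (a + b) / (a + b + p + q)  # share of all correct decisions
    return tpr, tnr, accuracy


# Hypothetical counts for 200 test samples.
tpr, tnr, accuracy = evaluation_metrics(a=90, b=80, p=20, q=10)
print(tpr, tnr, accuracy)  # 0.9 0.8 0.85
```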

Experimental Assessment
The experimental assessment of the developed technique is depicted in Fig. 5. Here, the input audio signal is taken randomly from dataset-1 and dataset-2, as depicted in Fig. 5a, and the output obtained after applying the spectral centroid is shown in Fig. 5b.

Performance Analysis
This section elucidates the performance assessment of the developed Fr-WCSO-enabled DRN method using dataset-1 and dataset-2 considering the evaluation metrics, such as TPR, TNR, and accuracy.
(i) Analysis based on dataset-1 Figure 6 elucidates the performance assessment of the developed method on dataset-1 in terms of TPR, TNR, and testing accuracy. Figure 6a presents the analysis based on TPR. With 60% training data, the TPR computed by the developed Fr-WCSO-based DRN is 0.910 at iteration 10, 0.913 at iteration 20, 0.916 at iteration 30, and 0.919 at iteration 40. The assessment based on TNR is shown in Fig. 6b. With 70% training data, the developed Fr-WCSO-based DRN attains a TNR of 0.883 at iteration 10, 0.886 at iteration 20, 0.890 at iteration 30, and 0.893 at iteration 40. The analysis using testing accuracy is portrayed in Fig. 6c. With 80% training data, the testing accuracy computed by the developed Fr-WCSO-based DRN is 0.923 at iteration 10, 0.926 at iteration 20, 0.930 at iteration 30, and 0.933 at iteration 40.
(ii) Analysis based on dataset-2 The performance analysis of the developed method on dataset-2 in terms of the evaluation measures is shown in Fig. 7. The assessment using TPR is depicted in Fig. 7a. With 60% training data, the developed Fr-WCSO-based DRN attains a TPR of 0.921 at iteration 10, 0.925 at iteration 20, 0.928 at iteration 30, and 0.931 at iteration 40. Figure 7b presents the assessment based on TNR. With 70% training data, the TNR computed by the developed Fr-WCSO-based DRN is 0.873 at iteration 10, 0.876 at iteration 20, 0.880 at iteration 30, and 0.883 at iteration 40. The analysis using testing accuracy is portrayed in Fig. 7c. With 80% training data, the testing accuracy computed by the developed Fr-WCSO-based DRN is 0.916 at iteration 10, 0.919 at iteration 20, 0.922 at iteration 30, and 0.925 at iteration 40.

Comparative Analysis
This section elucidates the comparative assessment of the developed technique using dataset-1 and dataset-2 based on the evaluation measures such as TPR, TNR, and testing accuracy.
(i) Analysis using dataset-1 Figure 8 portrays the comparative assessment on dataset-1 with respect to TPR, TNR, and testing accuracy. Figure 8a represents the analysis using TPR. With 70% training data, the TPR measured by the developed Fr-WCSO-based DRN is 0.930, while the TPR measured by the existing techniques, namely Random Forest classifier, machine learning, DNN, CNN, and WCSO-based HAN [9], is 0.751, 0.819, 0.858, 0.872, and 0.912, respectively. The performance gain of the developed method over the existing methods is 19.307%, 11.938%, 7.750%, 6.321%, and 2.008%, respectively. The assessment based on TNR is portrayed in Fig. 8b. With 80% training data, the developed Fr-WCSO-based DRN computed a TNR of 0.912, while the TNR computed by the existing methods is 0.718 for Random Forest classifier, 0.771 for machine learning, 0.809 for DNN, 0.841 for CNN, and 0.893 for WCSO-based HAN. The performance improvement achieved by the developed method over the existing methods is 21.265%, 15.400%, 11.248%, 7.705%, and 2.066%, respectively. Figure 8c depicts the assessment with respect to testing accuracy. With 60% training data, the testing accuracy measured by Random Forest classifier is 0.710, machine learning is 0.755, DNN is 0.803, CNN is 0.847, WCSO-based HAN is 0.883, and the developed Fr-WCSO-based DRN is 0.902. The performance gain of the developed method over the existing methods is 21.340%, 16.359%, 11.042%, 6.171%, and 2.177%, respectively.
(ii) Analysis based on dataset-2 Figure 9 presents the comparative assessment using dataset-2 in terms of TPR, TNR, and testing accuracy. The assessment based on TPR is presented in Fig. 9a. When the training data is 60%, the developed Fr-WCSO-based DRN computed a TPR value of 0.931, while the TPR value computed by the existing methods, such as Random Forest classifier is 0.743, machine learning is 0.782, DNN is 0.830, CNN is
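The source does not state how the performance gain percentages are derived. The sketch below assumes the gain is the difference between the proposed and the existing value, taken relative to the proposed value. This assumed formula reproduces the reported dataset-1 TNR gains to within about 0.1 percentage points; the remaining gap is presumably due to rounding of the tabulated values. The function name `performance_gain` is hypothetical.

```python
def performance_gain(proposed, baseline):
    # Assumed formula: improvement relative to the proposed method's value.
    return (proposed - baseline) / proposed * 100.0


# Dataset-1 TNR values at 80% training data, as reported in the text.
proposed_tnr = 0.912
baseline_tnr = {
    "Random Forest classifier": 0.718,
    "machine learning": 0.771,
    "DNN": 0.809,
    "CNN": 0.841,
    "WCSO-based HAN": 0.893,
}
# Gains reported in the text, for comparison.
reported_gain = {
    "Random Forest classifier": 21.265,
    "machine learning": 15.400,
    "DNN": 11.248,
    "CNN": 7.705,
    "WCSO-based HAN": 2.066,
}

gains = {
    name: performance_gain(proposed_tnr, value)
    for name, value in baseline_tnr.items()
}
for name, gain in gains.items():
    print(f"{name}: computed {gain:.3f}%, reported {reported_gain[name]:.3f}%")
```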

Comparison of Running Time Complexity
While analyzing the results tabulated in Tables 3 and 4, it is clear that the developed Fr-WCSO-based DRN achieved a lower running time complexity. These results are obtained because the model is well trained with the proposed Fr-WCSO, thereby increasing the learning rate of the Deep Residual Network. The performance gain of the developed method over the existing methods in terms of running time complexity is shown in Figs. 11 and 12.

Convergence Assessment
The convergence assessment of the developed technique for both the training and testing phases on dataset-1 and dataset-2 is depicted in Fig. 13. At the 20th iteration, the fitness evaluated by the proposed method using dataset-1 is 0.853 for the training stage and 0.661 for the testing stage. Similarly, at the 100th iteration, the fitness evaluated by the proposed technique using dataset-2 is 0.94 for the training data and 0.899 for the testing data. The analysis shows that, as the iteration number increases, the fitness also increases for both the training and testing data, which indicates the convergence of the proposed method.

Conclusion
This research presents a novel pulmonary abnormality detection model based on the proposed optimization-enabled deep learning method, called Fr-WCSO-based DRN. Here, the necessary features, namely BFCC, short-term features such as spectral flux, spectral centroid, and PSD, statistical features such as mean, standard deviation, entropy, energy, and kurtosis, and wavelet transform features, are extracted from the respiratory sound signal. After extracting the significant features, data augmentation is performed to minimize overfitting. The proposed Fr-WCSO is then employed for selecting the significant features needed for further processing. Moreover, the DRN classifier is employed for detecting the pulmonary abnormalities, and its training is performed using the developed optimization algorithm, Fr-WCSO, which is newly designed by integrating FC with WCSO, where WCSO is the combination of WCA and CSO. The developed Fr-WCSO-based DRN outperformed various existing techniques and achieved TPR, TNR, and testing accuracy values of 0.963, 0.932, and 0.948, respectively. Future work will consider larger datasets to improve the detection performance.

Funding Not applicable.

Declarations
Competing Interests No conflict of interest exists. We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.
Ethics Approval and Consent to Participate Approved by RDC of Mansarovar Global University, Madhya Pradesh, India.

Consent for Publication Yes.
Preprint A preprint version of this paper is available on Research Square (https://www.researchsquare.com/article/rs-1082541/v1), https://doi.org/10.21203/rs.3.rs-1082541/v1, to gain feedback from the community and to start making changes to the manuscript prior to peer review in a journal.