Ensemble of Deep Learning Enabled Modulation Signal

Automated classification of underwater acoustic signals makes it possible to use the acoustic spectrum effectively through actions such as interference avoidance and marine mammal protection. The modulation techniques commonly used for underwater acoustic communication are MPSK (BPSK, QPSK, and 8PSK) and MQAM. Detecting the modulation type of a received waveform lets sonar and communication systems share a similar bandwidth with minimal collision, and it can identify systems operating outside the permitted regime. Automated modulation classification models can make use of machine learning (ML) and deep learning (DL) techniques, particularly for underwater acoustic communication. With this motivation, this study designs a new ensemble of deep learning based modulation signal classification (EDL-MSC) model for underwater acoustic communication. Initially, an impulsive noise pre-processor is used to eliminate impulses from the target signal. Next, three deep learning models, namely bi-directional long short term memory (BiLSTM), gated recurrent unit (GRU), and stacked sparse autoencoder (SSAE), are used to derive features from the temporal waveform and square spectra of the pre-processed signal. In addition, black widow optimization (BWO) is applied for optimal hyperparameter tuning of the DL models. Lastly, a voting-based ensemble scheme is applied to integrate the outcomes of the three DL models. The proposed model is able to perform effective modulation classification in underwater acoustic communication. The performance of the EDL-MSC technique is validated under several aspects, and a comprehensive comparative analysis indicates that the EDL-MSC technique outperforms recent approaches.


Introduction
Underwater acoustic (UWA) communication is considered one of the more sophisticated wireless transmission techniques [1]. UWA channels make transmission difficult because of their inherent characteristics, namely prolonged delay, severe inter-symbol interference (ISI), and narrow bandwidth. These features seriously affect the stability of transmission methods and hinder higher-rate UWA communication [2]. Automated modulation classification (AMC) of UWA signals finds several applications in civilian and underwater acoustic confrontation scenarios and has gained considerable interest. However, traditional modulation classification approaches are inadequate in shallow water environments because of the complexity of the marine environment, including reverberation, impulsive noise, and a complex underwater acoustic channel with many multipath components, narrow bandwidth, and long delay [3]. Research on modulation recognition methods better suited to the UWA channel is therefore of major importance. Fig. 1 illustrates the outline of a UWA network.
AMC is an essential part of information recovery and attribute identification for received signals. Recently, with the growing demand for marine information acquisition and constantly evolving ocean-related technology [4], AMC for UWA transmission signals has become a hot research topic. However, owing to the complexity of the marine environment, progress in this area has been slow. Particularly in military applications, the transmitted signal is often short and bursty, which increases the difficulty of AMC. Traditional AMC for UWA transmission signals was based mainly on pattern recognition, performed in two phases: feature extraction and classification. First, distinct features are constructed based on domain knowledge; they are then fed into classifiers. Generally, AMC techniques are separated into two classes: feature-based pattern recognition methods and likelihood-based decision-theoretic methods [5]. The feature-based method is widely used for its lower complexity and better performance. In the feature extraction stage, a variety of features have been adopted, such as the Stockwell transform, the wavelet transform, and higher-order cumulants and moments [6]. In the classification phase, classifiers such as the support vector machine (SVM), decision tree, clustering algorithms, and neural networks are frequently used. However, this method requires a large amount of expertise and prior knowledge to design the feature extraction model; when prior knowledge is lacking, generalization and accuracy are often unsatisfactory, particularly in the complex UWA channel [7]. In recent times, deep learning (DL) methods have attained outstanding performance in several fields, particularly in classification tasks, owing to their capability to learn higher-level features hidden in the data.
Modulation classification with DL approaches has become an increasingly important research field. It has also been demonstrated that a simple convolutional neural network (CNN) outperforms algorithms built on decades of expert feature search for radio modulation [8].

Fig. 1. Underwater acoustic communication network
This study concentrates on the design of an ensemble of deep learning based modulation signal classification (EDL-MSC) model for underwater acoustic communication. The EDL-MSC technique first applies an impulsive noise pre-processing technique to remove the noise present in the target signals. In addition, an ensemble of three DL models, namely bi-directional long short term memory (BiLSTM), gated recurrent unit (GRU), and stacked sparse autoencoder (SSAE), is used for modulation signal classification in underwater acoustic communication. Moreover, the hyperparameters of the DL models are tuned using the black widow optimization (BWO) technique. To examine the outcomes of the EDL-MSC approach, a set of simulations is carried out and the results are inspected under several aspects.

Literature Review
Lee-Leon et al. [9] proposed a receiver technique that explores the ML-DBN method to combat the signal distortion created by multipath propagation and the Doppler effect. First, the received signals are segmented into frames, and each frame is pre-processed individually by a pixelization method. Next, using a DBN based de-noising approach, features are extracted from each frame and used to reconstruct the received signals. Lastly, DBN based classification of the reconstructed signals takes place. Jiang et al. [10] proposed the use of PCA for effective extraction of power spectral and square spectral features of UWA signals in the presence of the noise, multipath, and Doppler introduced by the UWA channel. With the features obtained by PCA, an ANN classifier is adopted to recognize the modulation of UWA transmission signals. In Liu et al. [11], a deep heterogeneous network combines an LSTM and a hybrid dilated convolution network to automatically capture the hidden features of data series and accomplish modulation detection for four UWA transmission signals: QPSK, OOK, 2FSK, and 2PSK. In Wang et al. [8], a hybrid time series network architecture was proposed for AMC. It can accommodate variable-length signal data to match the fixed-length input required by the shared NN, and it can appropriately handle the zero data in the signal sequence to improve the resulting loss. Wei et al. [9] proposed a technique for automated modulation classification of digital transmission signals using an SVM with hybrid features based on cyclostationarity and information entropy. By integrating the concepts of entropy and cyclostationarity with existing signal features, the approach presents three novel features to support the classification of digital transmission signals.
Yao et al. [10] introduced a modulation recognition technique based on a generative adversarial network (GAN) to increase the robustness of modulation recognition for UWA transmission signals. The generator of the GAN is trained to restore the distorted signal, and the discriminator is trained to extract features from UWA transmission signals and categorize them automatically. Li-Da et al. [11] proposed a DNN method for AMC of UWA transmissions that integrates an LSTM and a CNN. The LSTM learns from amplitude and phase, and the CNN learns from time-domain IQ data. A multipath fading UWA channel with Doppler frequency shift and alpha-stable impulse noise is modelled for signal dataset generation based on real marine environment information.

Signal Model
AMC plays an essential role in signal demodulation and classification at the receiver. Unlike other widely employed ML techniques, DL methods for modulation classification do not require explicit feature extraction. System efficiency is improved and the process is streamlined without degrading the classification results. We focus on channel models built on three aspects: Gaussian noise, multipath fading, and the Doppler effect. These channel models are treated as a convolution kernel with additive noise [12]. For analysis, we define the received signal model as $y(t) = x(t) \otimes h(t, \delta) + n(t)$, where $x(t)$ represents the modulated communication signal and $y(t)$ indicates the received signal after passing through the underwater acoustic channel and being corrupted by additive noise. Given the complexity of the underwater acoustic channel, $h(t, \delta)$ is used to characterize the channel parameters so that their impact on the received signal can be better analysed. Here $\alpha_i(t)$ signifies the attenuation coefficient of the $i$-th path, $\delta(t)$ represents the random time delay, and $N$ indicates the number of multipaths. The additive noise $n(t)$ is assumed to have Gaussian statistics. The operator $\otimes$ denotes convolution.
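The received-signal model above can be illustrated with a short discrete-time sketch. It assumes a time-invariant tap-delay channel (fixed gains standing in for the attenuation coefficients, integer sample delays standing in for the path delays) and zero-mean Gaussian noise; the Doppler effect is not modelled. The function names and the example values are illustrative and are not taken from the paper.

```python
import math
import random

def convolve(x, h):
    """Full linear convolution of two real sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def multipath_impulse_response(gains, delays):
    """Tap-delay line: gains[i] is placed at sample index delays[i]."""
    h = [0.0] * (max(delays) + 1)
    for g, d in zip(gains, delays):
        h[d] += g
    return h

def received_signal(x, gains, delays, noise_std, rng):
    """y = x (*) h + n, with n drawn from a zero-mean Gaussian."""
    h = multipath_impulse_response(gains, delays)
    clean = convolve(x, h)
    return [c + rng.gauss(0.0, noise_std) for c in clean]

# Hypothetical example: a sampled tone through a three-path channel.
rng = random.Random(0)
x = [math.cos(2 * math.pi * 0.1 * n) for n in range(50)]
y = received_signal(x, gains=[1.0, 0.5, 0.2], delays=[0, 3, 7],
                    noise_std=0.05, rng=rng)
```

Setting `noise_std` to zero returns the pure multipath response, which is convenient for checking the channel taps in isolation.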

Modulation Model
The modulation methods widely employed in underwater acoustic transmission are MPSK (BPSK, QPSK, and 8PSK) and MQAM. MPSK is the more frequently used of the two. In phase shift keying, the amplitude and frequency of the carrier are held constant and the phase is the modulated variable [13]. The MPSK signal is given by $s(t) = A\cos(w_c t + \theta_m)$, in which $A$ denotes the amplitude, $w_c$ indicates the angular frequency, and the phase $\theta_m$ is taken from a uniformly spaced set of phase angles.
Here $M$ is the number of symbols, and the phase interval between two adjacent symbols of the modulated signal is $2\pi/M$. For instance, the phase spacing of the four QPSK symbols is $\pi/2$. The form of the MQAM signal is slightly different from that of MPSK.
Here $A_i = a_i\cos(\phi_i)$ and $B_i = b_i\sin(\phi_i)$; MQAM modulates two distinct carriers, where $a_i$ and $b_i$ represent the two sequences to be transmitted.
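As a small illustration of the two modulation families, the sketch below builds complex-baseband constellations. It assumes MPSK phases of the form $2\pi m/M$ for $m = 0, \dots, M-1$ and, for MQAM, a square grid of odd-integer levels on the two quadrature carriers; both conventions and all function names are assumptions made for the example and are not the paper's own definitions.

```python
import cmath
import math

def mpsk_constellation(M, amplitude=1.0):
    """M points of equal amplitude with phases spaced 2*pi/M apart."""
    return [amplitude * cmath.exp(1j * 2 * math.pi * m / M) for m in range(M)]

def square_qam_constellation(M):
    """Square M-QAM grid (assumed layout): odd-integer levels on I and Q."""
    side = int(round(math.sqrt(M)))
    if side * side != M:
        raise ValueError("M must be a perfect square for a square grid")
    levels = [2 * k - (side - 1) for k in range(side)]  # e.g. -3, -1, 1, 3
    return [complex(a, b) for a in levels for b in levels]

def passband_sample(symbol, w_c, t):
    """Real passband value Re{symbol * exp(j*w_c*t)} at time t."""
    return (symbol * cmath.exp(1j * w_c * t)).real

qpsk = mpsk_constellation(4)
# Phase spacing between adjacent QPSK symbols; approximately pi/2.
spacing = cmath.phase(qpsk[1]) - cmath.phase(qpsk[0])
qam16 = square_qam_constellation(16)
```

For an MPSK symbol $A e^{j\theta_m}$, `passband_sample` evaluates $A\cos(w_c t + \theta_m)$, which matches the real-valued MPSK form.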

Impulsive Noise Pre-processing
Underwater acoustic noise has a wide dynamic range, mainly owing to the occurrence of high-amplitude impulses. The significant numerical variation among different waveform instances raises the likelihood of gradient imbalance and model non-convergence during network training [14]. It is therefore essential to apply impulse suppression and normalization pre-processing to the received signal. The impulse noise pre-processing approach non-linearly suppresses the locations where the amplitude of the received signal $y(n)$ exceeds a chosen threshold $\tau_r$. The denoised signal $y'(n)$ is defined using Eq. (5) [15], where $\tau_0$ indicates a constant coefficient. Finally, $y'(n)$ is normalized to obtain the output of the impulse noise pre-processing approach.
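The two pre-processing steps can be sketched as follows. The suppression rule used here is a plain hard limiter, chosen only as a stand-in for the non-linear rule of Eq. (5), and the normalization is by peak absolute value; both choices, the threshold value, and the function names are assumptions for illustration and are not the paper's exact definitions.

```python
def suppress_impulses(y, threshold):
    """Limit samples whose magnitude exceeds the threshold (stand-in rule)."""
    out = []
    for v in y:
        if abs(v) > threshold:
            out.append(threshold if v > 0 else -threshold)
        else:
            out.append(v)
    return out

def normalize_peak(y):
    """Scale the sequence so that its largest magnitude equals one."""
    peak = max(abs(v) for v in y)
    return [v / peak for v in y] if peak > 0 else list(y)

# Hypothetical received samples containing two large impulses.
received = [0.2, -0.1, 9.0, 0.3, -7.5, 0.05]
cleaned = normalize_peak(suppress_impulses(received, threshold=1.0))
```

Limiting before normalizing keeps the ordinary samples from being squeezed towards zero by a single large impulse, which is the motivation given above for treating the dynamic range before training.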

Ensemble of DL Models
At this stage, the pre-processed signals are fed into the DL models for feature learning and modulation classification. In the weighted voting based ensemble method, the DL techniques are combined and the final result is selected by a weighted vote. The voting approach is trained with all individual vectors, and the respective 10-fold cross-validation (CV) accuracy is then assessed as the FF [20]. Given $n$ classes and $D$ base classifiers that vote, the predicted class $c_k$ of the weighted vote for each instance $k$ is determined as $c_k = \arg\max_j \sum_{i=1}^{D} \Delta_{ji} w_i$, where $\Delta_{ji}$ is a binary variable. If the $i$-th base classifier assigns instance $k$ to the $j$-th class, then $\Delta_{ji} = 1$; otherwise, $\Delta_{ji} = 0$. $w_i$ refers to the weight of the $i$-th base classifier in the ensemble. The accuracy is then represented by the proportion of test instances $k$ for which $c_k$ is the true class.
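The weighted vote can be sketched in a few lines. Each base classifier adds its weight to the class it predicts, and the class with the largest total wins. Deriving the weights by normalizing cross-validation accuracies is an assumption made for this example, as are the function names and the sample values.

```python
def weighted_vote(predictions, weights, n_classes):
    """Return the class index with the largest summed classifier weight."""
    scores = [0.0] * n_classes
    for pred, w in zip(predictions, weights):
        scores[pred] += w  # Delta_ji = 1 only for the predicted class
    return max(range(n_classes), key=lambda j: scores[j])

def accuracy_weights(cv_accuracies):
    """Assumed weighting: CV accuracies normalized to sum to one."""
    total = sum(cv_accuracies)
    return [a / total for a in cv_accuracies]

# Hypothetical case: three base models, four classes.
weights = accuracy_weights([0.9, 0.4, 0.4])
decision = weighted_vote([2, 0, 0], weights, n_classes=4)
```

In this example the single accurate model outvotes the two weaker ones, whereas equal weights would give the majority class; ties resolve to the lowest class index.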

GRU Model
GRU is a modified version of the LSTM model. It retains the merits of the RNN, as it automatically learns features and models long-term dependencies. It can be applied to prediction and classification tasks. In the GRU structure, the input and forget gates of the LSTM are merged into a single update gate, which computes how much information from earlier states is carried over to the present time step [21]. The other gate, the reset gate, determines how new data is combined with the earlier data. The GRU therefore has one gate fewer than the LSTM. In addition, the cell and hidden states of the LSTM are combined into a single hidden state in the GRU. These modifications give the GRU fewer parameters and a faster training speed. It also requires less data to generalize. The GRU can be defined mathematically as follows.
Eqs. (10) and (11) depict how the update gate $z_n$ and the reset gate $r_n$ of the GRU neuron are computed. $W_z$ represents the weight of $z_n$, $W_r$ signifies the weight of $r_n$, and $\sigma$ designates the sigmoid operation. The inner element $[h_{n-1}, x_n]$ signifies the concatenation of the vectors $h_{n-1}$ and $x_n$. A higher value of $z_n$ indicates that more information is handled by the present cell and less by the earlier cell [16]. If $r_n$ equals zero, the information in the preceding cell is discarded. The next two equations illustrate the computation of the candidate output $\tilde{h}_n$ and the final output $h_n$ of the GRU-NN. $h_{n-1}$ stands for the output of the preceding cell, $W$ signifies the weight of $\tilde{h}_n$, and $\tanh$ denotes the hyperbolic tangent operation. $\tilde{h}_n$ is obtained by multiplying $h_{n-1}$ of the preceding cell by $r_n$, combining the result with $x_n$, multiplying by $W$, and applying the hyperbolic tangent. $h_n$ is the sum of two vectors: one is obtained by multiplying $1 - z_n$ by $h_{n-1}$, and the other by multiplying $z_n$ by $\tilde{h}_n$.
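The verbal description of the GRU update can be turned into a minimal single-step sketch. It follows the description above (sigmoid gates on the concatenated previous state and input, a tanh candidate computed from the reset-gated previous state, and a convex combination for the new state). Bias terms are omitted because only weights are mentioned, and the random weights, sizes, and helper names are assumptions for illustration.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(h_prev, x, W_z, W_r, W):
    """One GRU time step without bias terms."""
    concat = h_prev + x                                   # [h_{n-1}, x_n]
    z = [sigmoid(a) for a in matvec(W_z, concat)]         # update gate
    r = [sigmoid(a) for a in matvec(W_r, concat)]         # reset gate
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + x    # [r_n * h_{n-1}, x_n]
    h_cand = [math.tanh(a) for a in matvec(W, gated)]     # candidate state
    return [(1 - zi) * hp + zi * hc
            for zi, hp, hc in zip(z, h_prev, h_cand)]

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes: 3 hidden units, 2 input features, 3 time steps.
rng = random.Random(0)
hidden, inputs = 3, 2
W_z, W_r, W = (random_matrix(hidden, hidden + inputs, rng) for _ in range(3))
h = [0.0] * hidden
for x_n in ([0.1, -0.2], [0.4, 0.0], [-0.3, 0.5]):
    h = gru_step(h, x_n, W_z, W_r, W)
```

With all weights set to zero both gates equal 0.5 and the candidate is zero, so each step simply halves the previous state, which gives a quick sanity check of the update rule.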

BiLSTM Model
RNN architectures are broadly used for analyzing and predicting time series data. However, RNNs frequently suffer from vanishing gradients during training, which makes it hard to retain earlier information, i.e., the long-term dependency problem [17]. To handle this issue, the LSTM network was introduced, which has a memory function over long spans. The network uses a gating mechanism to regulate the information flow and systematically determines how much incoming information is retained at each time step. The LSTM unit is composed of a storage unit and three control gates (forget, input, and output gates). $x_z$ and $h_z$ correspond to the input and hidden states at time $z$, respectively. $f_z$, $i_z$, and $o_z$ denote the forget, input, and output gates, respectively. $C_z$ denotes the candidate information to be stored, and the amount stored is then controlled by the input gate. The gates, input candidate, cell state, and hidden state are computed from the following quantities. $W_f$, $W_i$, $W_o$, and $W_c$ represent the weight matrices of the forget, input, output, and update states, respectively. $b_f$, $b_i$, $b_o$, and $b_c$ represent the bias vectors of the forget, input, output, and update states, respectively. $x_z$ indicates the time series data of the current time step $z$, and $h_{z-1}$ indicates the output of the memory unit at the preceding time step $z - 1$.
To build a more accurate predictive model, a BiLSTM network is used, which runs a forward and a backward LSTM network over every training sequence [18]. The two LSTM networks are connected to the same output layer to provide complete context information for every point in the sequence.
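The bidirectional arrangement can be sketched with a plain LSTM step run once forward and once over the reversed sequence. The gate equations used are the standard LSTM form acting on the concatenated previous hidden state and input, which is assumed here rather than taken from the paper. Joining the two directions by concatenation, the parameter layout, and all names are likewise illustrative assumptions.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(W, b, v):
    """Compute W v + b for a matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(h_prev, c_prev, x, p):
    """One standard LSTM step; p holds W_f, W_i, W_o, W_c and matching biases."""
    concat = h_prev + x
    f = [sigmoid(a) for a in affine(p["W_f"], p["b_f"], concat)]  # forget gate
    i = [sigmoid(a) for a in affine(p["W_i"], p["b_i"], concat)]  # input gate
    o = [sigmoid(a) for a in affine(p["W_o"], p["b_o"], concat)]  # output gate
    g = [math.tanh(a) for a in affine(p["W_c"], p["b_c"], concat)]  # candidate
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def run_lstm(seq, p, hidden):
    """Return the hidden state at every time step of seq."""
    h, c = [0.0] * hidden, [0.0] * hidden
    outputs = []
    for x in seq:
        h, c = lstm_step(h, c, x, p)
        outputs.append(h)
    return outputs

def bilstm(seq, p_fwd, p_bwd, hidden):
    """Concatenate forward and time-aligned backward hidden states."""
    fwd = run_lstm(seq, p_fwd, hidden)
    bwd = run_lstm(seq[::-1], p_bwd, hidden)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]

def init_params(hidden, inputs, rng):
    p = {}
    for gate in "fioc":
        p["W_" + gate] = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + inputs)]
                          for _ in range(hidden)]
        p["b_" + gate] = [0.0] * hidden
    return p

# Hypothetical example: 4 time steps, 1 input feature, 2 hidden units per direction.
rng = random.Random(1)
seq = [[0.1], [0.5], [-0.2], [0.3]]
features = bilstm(seq, init_params(2, 1, rng), init_params(2, 1, rng), hidden=2)
```

Each output vector holds information from both the past (forward half) and the future (backward half) of that time step, which is what gives every sequence point its full context.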

SSAE Model
DL is a relatively new field within ML. Its inspiration lies in simulating and building the neural network system of the human brain for analytical learning, reproducing the way the human brain comprehends information. In this study, the deep SSAE is adopted for feature reconstruction and reduction [19]. The SSAE forms higher-level features by integrating lower-level features to determine the distributed feature representation of protein feature data. The SSAE is an unsupervised system, i.e., a large-scale non-linear system composed of multiple layers of neurons in which the output of the current layer is fed into the next connected layer. An SSAE or SAE network largely consists of an encoder phase and a decoder phase, in which the encoder network compresses higher-dimensional features into lower-dimensional ones. The decoder network is responsible for restoring the original input layer by layer, and its architecture is symmetrical to that of the encoder network. In the encoding phase, the initial data $x$ is mapped to a hidden state, where $\sigma_1$ represents a non-linear function, $b_1$ represents the bias, and $w_1$ denotes the weight of the encoder. Next, the original data is reconstructed by the decoder phase, where $w_2$ represents the weights of the decoder network and $b_2$ denotes the bias. The aim of the SAE is to make the output as close as possible to the input by minimizing the loss function, in which $N$ represents the number of hidden units, $\beta$ indicates the weight of the sparse penalty term, $\hat{\rho}_j$ is the average activation value of hidden unit $j$, and $\rho$ indicates the sparsity parameter. The SSAE network used here has two hidden layers; the decoder phase is not shown, in order to highlight the feature reduction function of the network. As with the SAE, the key to training the model is to learn the parameter $\theta = (W, b)$ that gives the minimal deviation between output and input. When the optimum parameter $\theta$ is attained [20], the SSAE yields a function $\mathbb{R}^{d_x} \to \mathbb{R}^{d_{h^{(2)}}}$ that transforms the original information to a lower-dimensional space.
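A sparse autoencoder objective of the kind described above can be sketched as a reconstruction error plus a sparsity penalty. The sketch assumes sigmoid activations, a mean squared reconstruction error, and the commonly used Kullback-Leibler form of the sparsity penalty between $\rho$ and $\hat{\rho}_j$; these choices, the default values of `rho` and `beta`, and the function names are assumptions and may differ from the paper's exact loss.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def layer(x, w, b):
    """Sigmoid layer: sigma(w x + b), with w given as a list of rows."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def kl_divergence(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))

def sparse_autoencoder_loss(batch, w1, b1, w2, b2, rho=0.05, beta=3.0):
    """Reconstruction error plus beta times the summed sparsity penalty."""
    hidden = [layer(x, w1, b1) for x in batch]       # encoder
    recon = [layer(h, w2, b2) for h in hidden]       # decoder
    mse = sum(sum((r - xi) ** 2 for r, xi in zip(rec, x))
              for rec, x in zip(recon, batch)) / (2 * len(batch))
    rho_hat = [sum(h[j] for h in hidden) / len(batch) for j in range(len(b1))]
    penalty = sum(kl_divergence(rho, r) for r in rho_hat)
    return mse + beta * penalty

def stacked_encode(x, layers):
    """Pass x through successive encoder layers (the stacked part)."""
    for w, b in layers:
        x = layer(x, w, b)
    return x

# Hypothetical example: 2 inputs compressed to 3, then to 1 hidden unit.
batch = [[0.2, 0.9], [0.7, 0.1]]
w1, b1 = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]], [0.0, 0.0, 0.0]
w2, b2 = [[0.3, -0.2, 0.5], [0.7, 0.1, -0.4]], [0.0, 0.0]
loss = sparse_autoencoder_loss(batch, w1, b1, w2, b2)
code = stacked_encode(batch[0], [(w1, b1), ([[0.4, -0.1, 0.3]], [0.0])])
```

After training, only the stacked encoder is kept for feature reduction, which corresponds to the mapping from the input space to the second hidden layer.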

Hyperparameter Tuning using BWO Algorithm
To tune the hyperparameters of the DL models optimally, the BWO algorithm is applied, which helps to improve the modulation signal classification outcomes. BWO is a novel and effective meta-heuristic optimization technique for nonlinear optimization problems, first introduced in [21]. The BWO technique is inspired by the mating behaviour of black widow spiders. Its efficiency in obtaining optimal feasible solutions was validated on 51 distinct benchmark functions and three engineering problems. The technique offers several benefits, including early convergence and an optimal fitness value relative to other approaches. BWO improves the convergence speed and avoids the local optimum problem by providing a proper balance between the exploitation and exploration stages. It is also worth noting that BWO is able to explore a large region to obtain global optimal solutions, and it is therefore helpful in solving many optimization problems with several local optima.
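As a rough illustration of how a BWO-style search proceeds through the initialization, procreation, cannibalism, and mutation phases detailed in the next paragraph, a simplified sketch is given here. It makes several assumptions: the sphere function stands in for the fitness (in the EDL-MSC setting this would be a validation error of a DL model for a given hyperparameter vector), cannibalism is reduced to keeping the fitter offspring of each pair, mutation swaps two coordinates, and all parameter names and default values are hypothetical.

```python
import random

def sphere(x):
    """Stand-in fitness to minimize; not the paper's objective."""
    return sum(v * v for v in x)

def procreate(x1, x2, rng):
    """Offspring as complementary random blends of the two parents."""
    a = [rng.random() for _ in x1]
    b1 = [ai * p + (1 - ai) * q for ai, p, q in zip(a, x1, x2)]
    b2 = [ai * q + (1 - ai) * p for ai, p, q in zip(a, x1, x2)]
    return b1, b2

def mutate(x, rng):
    """Swap two randomly chosen coordinates (assumed mutation operator)."""
    y = list(x)
    i, j = rng.sample(range(len(y)), 2)
    y[i], y[j] = y[j], y[i]
    return y

def bwo(fitness, dim, pop_size=20, iters=50, cannibal_rate=0.5,
        mutation_rate=0.2, bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Initialization: widows scattered randomly in the search space.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(iters):
        pop.sort(key=fitness)
        parents = pop[:pop_size // 2]
        children = []
        for _ in range(len(parents)):
            x1, x2 = rng.sample(parents, 2)
            brood = sorted(procreate(x1, x2, rng), key=fitness)
            # Simplified cannibalism: only the fittest share of the brood survives.
            keep = max(1, int(len(brood) * (1 - cannibal_rate)))
            children.extend(brood[:keep])
        mutants = [mutate(rng.choice(parents), rng)
                   for _ in range(int(mutation_rate * pop_size))]
        pop = sorted(parents + children + mutants, key=fitness)[:pop_size]
    return pop[0]

best = bwo(sphere, dim=3)
```

Because the fitter half of the population is always retained, the best fitness never gets worse from one iteration to the next in this sketch.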
The BWO technique has four important phases: population initialization, procreation, cannibalism, and mutation. In the initial phase, the candidate solutions, called widows, are distributed randomly in the search space [22]. During the procreation phase, pairs of parents are chosen randomly, and a new generation (spider babies) is generated as $B_1 = a \times x_1 + (1 - a) \times x_2$ and $B_2 = a \times x_2 + (1 - a) \times x_1$, where $x_1$ and $x_2$ represent the parents, $B_1$ and $B_2$ denote the spider babies of the new generation, and $a$ indicates a random vector with entries between zero and one. During the cannibalism phase, the population is reduced in three distinct ways. In the first, a female spider eats the male (female and male are determined by their fitness). In the second, stronger spiders eat weaker ones. In the third, the babies eat their mother. In BWO, the cannibalism rate (CR) defines the number of survivors in the population. During the mutation stage, a mutation function is applied to a random number of population members.
The overall classification results of the EDL-MSC technique under SNR values of -20dB and -10dB are offered in Table 1. With an SNR of -20dB, the EDL-MSC technique classified the 8PSK signal with precision, recall, and F-score of 77.66%, 73%, and 72.56%, respectively. At the same time, the EDL-MSC technique categorized the BPSK signal with precision, recall, and F-score of 73.73%, 87%, and 79.82%, respectively. Likewise, the EDL-MSC technique identified the 16QAM signal with precision, recall, and F-score of 91.01%, 81%, and 85.71%, respectively. Similarly, the EDL-MSC technique classified the 64QAM signal with precision, recall, and F-score of 61.47%, 67%, and 64.11%, respectively. The experimental results showed that the EDL-MSC technique attained higher performance with the SNR value of -20dB compared to -10dB.
For instance, under an SNR of -20dB, the EDL-MSC technique obtained average precision, recall, and F-score of 75%, 74%, and 74.45%, respectively. Besides, under an SNR of -40dB, the EDL-MSC technique resulted in average precision, recall, and F-score of 59.63%, 58%, and 58.38%, respectively. Table 2 and Fig. 9 demonstrate the modulation signal classification performance of the EDL-MSC technique under varying SNR values and numbers of epochs. The figure shows that the classification accuracy increases with the SNR and saturates at an SNR of 15dB. It is also noticed that the classification accuracy tends to be higher as the epoch count increases. Table 3 and Fig. 10

Conclusion
This study has developed a novel EDL-MSC technique for classifying the modulation signals that exist in underwater acoustic communication. The EDL-MSC technique encompasses several subprocesses, namely an impulsive noise based pre-processor, an ensemble of DL based classifiers (GRU, BiLSTM, and SSAE), and BWO based hyperparameter tuning. The application of the BWO algorithm assists in properly selecting the hyperparameter values of the DL models and thereby achieves improved classification performance.
To investigate the efficiency of the EDL-MSC technique, a comprehensive comparison study was made, and the results pointed to improvements of the EDL-MSC approach over recent approaches in terms of different measures. Therefore, the EDL-MSC technique can be utilized as an effective tool for modulation signal classification in underwater acoustic communication. In future, hybrid DL models can be used to improve the overall classification outcomes of the EDL-MSC technique.

Declarations
Funding: The authors received no financial support for the research or publication of this article.

Conflict of interest: The authors declare no conflicts of interest to publish this paper.
Availability of data and material: Data sharing is not applicable as no datasets were generated or analyzed during this study.
Code availability: The code generated during this study is available with the corresponding author.
Ethics approval: This research does not involve human or animal participants.
Consent to participate: Not applicable.