A convolutional neural network approach to Doppler spectra classification of 205 MHz radar

Wind profiler radars are capable of measuring three-dimensional wind profiles at various altitudes of the atmosphere at very high temporal and spatial resolution. The Advanced Centre for Atmospheric Radar Research (ACARR), located at Cochin University of Science and Technology (CUSAT), operates the world’s first 205 MHz stratosphere-troposphere wind profiler radar, which provides three-dimensional wind profiles over an altitude range of 315 m to 20 km. During non-rainy conditions, the radar Doppler power spectrum bears the signature of ambient air motion, whereas during rainy conditions it contains signatures of both ambient air motion and the fall velocity of rain droplets. The classification of Doppler power spectra into rainy (Precipitation) and non-rainy (Clear) conditions is necessary because wind profile retrieval from the former needs careful separation of ambient air motion from the fall velocity of droplets. Manual classification of the power spectra is cumbersome, time-consuming, and therefore impractical given the vast database. This work automates Doppler power spectra classification using a deep learning Convolutional Neural Network (CNN). The proposed CNN model achieves a k-fold validation accuracy of 99.77% and a testing accuracy of 99.60% for power spectra classification. The performance of the CNN is compared against other popular machine learning classifiers, namely Support Vector Machine, Decision Tree, K Nearest Neighbour, and Naive Bayes. The comparison results show that the proposed CNN outperforms the other models in radar Doppler power spectra classification.


Introduction
Wind profiler radars measure accurate three-dimensional wind profiles from different layers in the atmosphere with high temporal and spatial resolutions. Doppler Beam Swinging (DBS) is the most commonly used method for wind profile estimation, wherein radar signals are transmitted towards the atmosphere in zenith and off-zenith directions, the azimuth angles of which are orthogonal to one another. The received echoes are Doppler shifted; the Doppler power spectra (DPS) thus obtained carry the information on the wind velocity in the atmosphere. A unique very high frequency (VHF) stratosphere-troposphere (ST) wind profiler radar (WPR) at 205 MHz has been operational in Cochin (10.04° N, 76.33° E), India, since 2017 (see Fig. 1). This facility provides accurate three-dimensional wind profiles over an altitude range of 315 m to 20 km (Kottayil et al. 2016; Mohanakumar et al. 2017; Kottayil et al. 2018; Nithya et al. 2019; Kottayil et al. 2020; Sujithlal et al. 2022).
Shailesh S., Judy M. V. and Ajil Kottayil contributed equally.
The main scientific objective behind the installation of this radar is to study the dynamics of the Indian summer monsoon. It has been shown in Fukao et al. (1985) that VHF radars are sensitive to both Bragg and Rayleigh scattering, which enables the simultaneous observation of both ambient air motion and the fall velocity of precipitation from the radar echoes during rainy conditions. One of the significant advantages of the 205 MHz radar is its ability to discern background air motion from precipitation echoes. The 205 MHz radar is operated 24/7 during the southwest monsoon season (June, July, August, and September) every year, and usually the Doppler power spectra are generated every 7 min. However, during the monsoon season, due to frequent rains, the Doppler power spectra will have signals of both ambient air motion and the fall velocity of raindrops. In order to derive wind profiles from the Doppler power spectra obtained during rainy conditions, echoes of the ambient air motion must be separated from the precipitation echoes, as the former contain the actual air motion. Since Doppler power spectra are frequently generated from radar observations, the first step is to classify the Doppler power spectra obtained during rainy (Precipitation DPS) and non-rainy (Clear DPS) conditions. A manual inspection of the power spectra one by one for classification is cumbersome, time-consuming, and therefore not practical due to the enormous database. To circumvent this issue, the use of automated classification methods is inevitable.
Artificial intelligence/machine learning techniques are widely in use in several areas of radar remote sensing. Marzano et al. (2007) and Nanzer et al. (2009) have proposed machine learning models that classify radar targets using a Bayesian approach. Roberto et al. (2017) proposed a hydrometeor classifier based on parallel support vector machines that use radial basis function kernels. Ostrovsky and Yanovsky (2006) used a multi-parametric decision technique to create a neural network classifier for classifying turbulence. Ibrahim et al. (2009) demonstrated the potential of a multi-layer perceptron neural network for ground vehicle categorization using forward scattering radar. A fuzzy logic approach for radar target classification is documented elsewhere (Andric et al. 2005; Al-Sakka et al. 2013; Thompson et al. 2014). The convolutional neural network (CNN), a deep learning-based classification technique, has been used to classify radar targets in various applications, such as polarimetric synthetic aperture radar image classification, hyperspectral image classification (Yu et al. 2017), vehicle classification (Capobianco et al. 2017), drone classification (Kim et al. 2016), etc. Lin et al. (2021) demonstrated a framework for deep learning-based radar target classification by merging data augmentation techniques and region of interest data with a residual CNN. It is challenging to manually design suitable features for radar image target classification due to the complex scattering processes and noise in radar images. The ability to automatically learn and extract useful features from the data makes deep learning (DL) models like the CNN best suited to radar image target classification. In this work, for the first time, a CNN is employed for classifying the 205 MHz wind profiler radar Doppler power spectra into rainy and non-rainy conditions.
The primary motive is to implement a robust automated classification algorithm based on a DL approach and to explore the efficacy of CNNs for radar echo identification. The proposed CNN-based DPS classification algorithm can be applied to any other WPR that demands segregation of DPS for rainy and non-rainy conditions. The significant contributions of this work are as follows:
• Creation of a DPS image dataset from one year of WPR raw data and labeling of the image dataset
• Development of a computational framework using deep learning for the automated Doppler power spectra classification of the 205 MHz WPR

Methodology
The high-level architecture of the proposed solution is given in Fig. 2, which includes data acquisition, preprocessing, and classification. Furthermore, each of these steps is discussed in detail below.

Data acquisition
The 205 MHz ST radar at Cochin is an active phased array radar with 619 antenna elements and a corresponding number of transmit-receive modules (TRM) of 500 W each. The antennae are arranged in a triangular grid with an inter-element spacing of 0.7λ, where λ is 1.43 m. The effective aperture area of the radar is 536 m², with a peak power aperture product of 1.6 × 10⁸ Wm². The gain of the entire array is 35 dBi with a half-power beamwidth of 3.2°. Electronic beam steering helps in tilting the beam 0–30° in the off-zenith direction and 0–360° in the azimuth direction, both with an angular resolution of 1°. Coded and uncoded modes of operation can be performed with the ST radar. The code length can be set from 1 to 64 bits in steps of 2ⁿ. The baud can be set within a range of 0.3 to 4.8 μs in steps of 0.3 μs. The technical specifications of the 205 MHz radar are given in Table 1. More details on radar validation and technical details can be found in Kottayil et al. (2016) and Mohanakumar et al. (2017).

DPS image
The received radar echoes are used to estimate the Doppler power spectrum. As a first step, the complex time-domain signals (in-phase and quadrature-phase) after pulse decoding are averaged over n pulses for each range gate separately. This process is repeated until a sufficient number of fast Fourier transform (FFT) points is accumulated. After applying a windowing operation (rectangular, Hamming, or Hanning) on the time series data, the data are subjected to FFT to generate the Doppler power spectra. The DPS for non-rainy and rainy conditions are shown in Fig. 3 (the color bar unit is in dB, which is arbitrary). For non-rainy cases, the DPS contains information on the ambient air motion. In contrast, in rainy cases, both ambient air motion and the fall velocity of rainfall can be seen. Unlike non-rainy conditions, the rainy condition will have multiple Doppler spectral peaks. Since our main motive is to classify DPS for rainy and non-rainy cases, we have generated the images of DPS for the whole of the year 2017. The images of DPS are then segregated into rainy and non-rainy cases with the help of domain experts. The ACARR maintains a logbook where the scientist makes entries and remarks for every profile scan. This logbook contains remarks on rain gauge readings. The profiles taken during rainfall are labeled as rainy profiles in the logbook. Such profiles are then visually cross-examined before being added to the rainy cases. After visual examination, the profiles affected by virga events (Rosenfeld and Mintz 1988) are also added to the rainy cases to create a more generalized classification model.
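The spectrum-generation steps above (pulse averaging, windowing, FFT) can be sketched in NumPy. This is a minimal illustration, not the radar's actual signal-processing chain: the sample counts, averaging factor, and the synthetic single-peak echo are all assumed for the example.

```python
import numpy as np

np.random.seed(0)

def doppler_power_spectrum(iq, n_avg=4):
    """Illustrative Doppler power spectrum for one range gate.

    iq    : 1-D complex I/Q time series after pulse decoding.
    n_avg : number of consecutive pulses averaged before the FFT.
    """
    # Average every n_avg consecutive pulses (pulse averaging).
    n = len(iq) // n_avg
    avg = iq[: n * n_avg].reshape(n, n_avg).mean(axis=1)
    # Hanning window to reduce spectral leakage, then FFT.
    spec = np.fft.fftshift(np.fft.fft(avg * np.hanning(n)))
    # Power in dB relative to an arbitrary reference.
    return 10.0 * np.log10(np.abs(spec) ** 2 + 1e-12)

# A synthetic echo with a single Doppler shift yields a single spectral peak.
t = np.arange(1024)
echo = (np.exp(2j * np.pi * 0.0625 * t)
        + 0.01 * (np.random.randn(1024) + 1j * np.random.randn(1024)))
dps = doppler_power_spectrum(echo)   # 256-point spectrum, peak near bin 192
```

A rainy-condition spectrum would show additional peaks from droplet fall velocities superimposed on the clear-air peak.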

Preprocessing
Before training the CNN with the Doppler power spectra images, the images are preprocessed. The images in the dataset are three-channel (RGB) color images of varying sizes, which may have been affected by different noise signals. In order to bring the images to a uniform size, remove the noise, and enhance the quality, they are processed using a preprocessing pipeline (see Fig. 4). Contrast-limited adaptive histogram equalization (CLAHE) (Zuiderveld 1994) is applied to the input image to boost local contrast and enhance edge definition. The output of histogram equalization is then filtered using the Wiener filter (Lim 1990). Filtering is applied to reduce the noise overamplified by the adaptive histogram equalization. The Wiener filter is an efficient linear adaptive filter that smooths noise while preserving edges and other high-frequency parts of an image. Then, a resize operation is performed on the denoised images to scale their dimensions down to 120 × 120 pixels with three channels. The resize operation reduces the memory requirement and computational complexity without changing the image properties. The output images are then normalized to the interval [0, 1] to simplify computation. Once the preprocessing is over, the preprocessed images are used for training the classifier models.
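The pipeline can be sketched as follows. This is a dependency-light approximation of the stages named above, not the authors' implementation: plain global histogram equalization stands in for CLAHE, and SciPy's Wiener filter and bilinear zoom are used for the filtering and resize steps.

```python
import numpy as np
from scipy.signal import wiener
from scipy.ndimage import zoom

def preprocess_dps(image):
    """Sketch of the preprocessing pipeline for an RGB DPS image in [0, 255].

    Note: the paper uses CLAHE; global histogram equalization is
    substituted here to keep the example dependency-light.
    """
    out = np.empty(image.shape, dtype=float)
    for c in range(3):                       # process each RGB channel
        chan = image[:, :, c]
        # Global histogram equalization (stand-in for CLAHE).
        hist, bins = np.histogram(chan.ravel(), bins=256, range=(0, 255))
        cdf = hist.cumsum() / hist.sum()
        eq = np.interp(chan.ravel(), bins[:-1], cdf * 255).reshape(chan.shape)
        # Wiener filtering to suppress noise amplified by equalization.
        out[:, :, c] = wiener(eq, mysize=3)
    # Resize to 120 x 120 x 3, as in the paper.
    scale = (120 / out.shape[0], 120 / out.shape[1], 1)
    resized = zoom(out, scale, order=1)
    # Normalize to [0, 1].
    return (resized - resized.min()) / (resized.max() - resized.min() + 1e-12)

img = np.random.rand(240, 200, 3) * 255      # a dummy variable-size image
pre = preprocess_dps(img)                    # uniform 120 x 120 x 3 in [0, 1]
```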

Implementation of CNN
A deep learning-based model, the convolutional neural network (LeCun et al. 2010), is used for building the classifier. CNN-based models are widely used in various research domains as these models can automatically extract vital features and classify images with better accuracy.

Working of CNN
The architecture of CNN models can be divided into two major phases. The first phase is the convolutional phase, which extracts the relevant features automatically, and the second phase is the neural network phase, which performs the classification. The features are extracted from the images using different kernels on the forward pass in the feature extraction phase. The primary operations associated with this phase are convolution, activation, and pooling. The main elements of the convolution operation are the input image, the feature detector (also known as the kernel or filter), and the output features (feature maps). Feature maps are obtained by an elementwise multiplication of the feature detector with the input and then summing the results into a single output pixel. The feature detector then slides over every input image location and repeats the above operation within the image. After the convolution operation, the activation function is applied to every element of the output features to introduce nonlinearity in the output. The pooling layer reduces the dimension of the output features while keeping the essential information intact. Pooling thus limits the number of parameters to learn and the number of computations performed. The output obtained from the feature extraction phase is a set of feature maps, which is flattened into a vector. This vector holds all the relevant features that need to be extracted from the preprocessed DPS image. The classification phase consists of several nodes or neurons distributed over various layers. The layers in this phase can be generally categorized as the input layer, hidden layers, and output layer. The input layer passes the feature vector from the previous phase to the next layer. The hidden layers are located between the input layer and the output layer. Every node (except the input layer nodes) receives inputs from a subset of other nodes and performs a nonlinear transformation from its inputs to its outputs.
The output layer is the last layer of the neural network phase that produces the classification results. The learning of the CNN model happens in two passes: forward pass and backward pass. In forward pass (forward propagation), the data flow is from phase one to phase two, whereas data (backpropagation) flow is reversed in the backward pass. The classification error is calculated after every training instance, and the weights and kernels are updated accordingly in the backpropagation. While designing a custom CNN model for image classification on new datasets, one can repeat the convolution, pooling, and activation function layers an arbitrary number of times to ensure the proper convergence and desired classification accuracy. The inference from the trained CNN model can be used for classifying new images during testing. Unlike the training, none of the learned parameters are updated during testing, and the data flows only in one direction from the convolutional phase to the classification phase.
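The convolution, ReLU, and max-pooling operations described above can be illustrated on a toy example. The 6 × 6 image and the vertical-edge kernel below are invented for illustration and are not the model's learned filters.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Elementwise multiply-and-sum of the kernel at every valid position."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(0, x)          # introduces nonlinearity

def max_pool(x, size=2, stride=2):
    """Keep the maximum of each size x size region (dimension reduction)."""
    H, W = x.shape
    return np.array([[x[i:i + size, j:j + size].max()
                      for j in range(0, W - size + 1, stride)]
                     for i in range(0, H - size + 1, stride)])

# A toy 6 x 6 image: dark left half, bright right half.
img = np.tile([0, 0, 0, 1, 1, 1], (6, 1)).astype(float)
# A hand-made kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)
feat = max_pool(relu(conv2d_valid(img, kernel)))   # 2 x 2 feature map
```

The pooled feature map responds strongly at the edge location, which is exactly the kind of automatically learned response a trained CNN filter produces.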

Architecture of proposed CNN model
In this work, we have created a simple sequential CNN model with three convolution layers to extract features, followed by three layers of a fully connected neural network for classification. The first layer of the CNN is a 2D convolutional layer which takes the preprocessed images of dimension 120 × 120 × 3 as input and produces output features of dimension 120 × 120 × 10. The first convolutional layer uses ten filters, each of shape 3 × 3 with stride [1, 1] and padding. The output of the first convolutional layer is fed into a Rectified Linear Unit (ReLU) (Nair and Hinton 2010) activation function followed by a max-pooling layer (Nagi et al. 2011), which selects the maximum element from the region of the feature map covered by the filter (2 × 2) with stride [2, 2]. ReLU is a simple activation function that can produce excellent results in classification problems when clubbed with the layers of a CNN. The mathematical definition of the ReLU activation function is given in Eq. 1, and its value ranges from 0 to +∞:

ReLU(x) = max(0, x)    (1)

The output shape of the max-pooling layer is 60 × 60 × 10. After this first stage, the convolution, activation, and pooling stages are repeated twice more to extract lower-level features.
The second convolutional layer takes the output of the previous max-pooling layer as input and extracts the lower-level features using filters of shape 3 × 3 with stride [1, 1] and padding. At this layer, ten feature maps are extracted using the filters and the ReLU activation function. The second convolution layer produces output features of shape 60 × 60 × 10, which are then transferred to the second max-pooling layer. The second max-pooling layer also uses filters of size 2 × 2 with stride [2, 2] and produces an output of shape 30 × 30 × 10, which is fed into the third convolutional layer. This convolution layer uses filters of size 3 × 3 with stride [1, 1] and padding to extract ten feature maps. The output features of dimension 30 × 30 × 10 produced after the third convolution layer are then passed through ReLU activation followed by max pooling using 2 × 2 filters with stride [2, 2] and padding. The output generated by the third max-pooling layer, of dimension 15 × 15 × 10, is flattened and then fed to a fully connected neural network containing three layers, including the output layer. The first two layers of the fully connected NN are the hidden layers. Layer 1 and layer 2 of the fully connected NN use the ReLU activation function and contain 64 and 32 nodes, respectively. The final layer of the neural network is the output layer, which has two nodes representing the two classes of the classification problem (clear echoes and precipitation echoes). The output layer uses the softmax function (Wang et al. 2018) for predicting the classes along with their probability values, which are calculated using Eq. 2:
f(x_i) = e^{x_i} / (e^{x_1} + e^{x_2} + ... + e^{x_N})    (2)

where N is the number of classes, x_1, x_2, ..., x_N are the input values, and f(x_i) is the output value that gives the probability of class i, which lies in the range 0 to 1. In our case, since we are performing binary classification, the value of N is 2. The summary of the CNN model is shown in Fig. 5. The feature maps obtained from the outputs of the three convolutional layers for sample DPS images are shown in Fig. 6.
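Equation 2 translates directly into NumPy; the logit values below are illustrative, not actual model outputs.

```python
import numpy as np

def softmax(x):
    """f(x_i) = exp(x_i) / sum_j exp(x_j); shifted by max(x) for stability."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

logits = np.array([2.0, 0.5])    # illustrative output-layer activations
probs = softmax(logits)          # one probability per class, summing to 1
```

With two output nodes (N = 2), the two probabilities always sum to 1, so the predicted class is simply the node with the larger value.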

Training the proposed CNN model
For efficiently training a CNN model, a large amount of sample data is required. A balanced dataset containing 10,000 images (5000 images each for the non-rainy and rainy classes), created with the help of domain experts, is used for training and testing the CNN model. After preprocessing, the image dataset is divided randomly into three subsets: a training set (70%), a validation set (10%), and a testing set (20%). The training set is employed for training the neural network model to learn the relationship between the inputs and the output class labels by adjusting the network weight parameters. The validation set is used for tuning the selected model during training. The testing set contains data samples unseen by the model and is used to evaluate the model performance. The CNN model is trained using a supervised approach that uses the input data and the actual class labels for training. An adaptive learning rate optimization algorithm, the Adam optimizer (Kingma and Ba 2014), is used for training the model. Moreover, the most common binary classification loss function, binary cross-entropy, is used to compute the loss between actual labels and predicted labels to optimize model performance. The loss function used in this work is given in Eq. 3:

Loss = −(1/N) Σ_{i=1..N} [y_i · log(p(y_i)) + (1 − y_i) · log(1 − p(y_i))]    (3)

where N is the number of samples. Here y is the label, taking the value 0 for class clear and 1 for class precipitation. Also, p(y) is the predicted probability for one class, and 1 − p(y) is the probability for the other class.
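Equation 3 can likewise be sketched in NumPy; the label vector and predicted probabilities below are invented for illustration.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch (Eq. 3 form)."""
    p = np.clip(p_pred, eps, 1 - eps)    # guard against log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y = np.array([0, 1, 1, 0])               # 0 = clear, 1 = precipitation
good = binary_cross_entropy(y, np.array([0.1, 0.9, 0.8, 0.2]))  # confident, correct
bad = binary_cross_entropy(y, np.array([0.9, 0.1, 0.2, 0.8]))   # confident, wrong
```

Confidently wrong predictions are penalized much more heavily than confidently correct ones, which is the gradient signal the optimizer uses to adjust the weights.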
During the training process, the samples from the training set are divided into equal-size batches (40 samples each), and in each iteration only one batch is used to train the network. In a single training epoch, all batches are fed to the model one by one in 175 iterations. The weights and biases of the proposed model are iteratively adjusted to minimize the loss. As shown in Fig. 7, after 50 epochs, the proposed model has a model loss and accuracy close to 0 and 1, respectively. Also, by observing the training and validation plots, we can discard the possibility of overfitting, as the proposed model works well on both training and validation data. Once the model training and validation are over, the inference can be used to classify Doppler power spectra with little computation time.

Results and performance analysis
The skill of the proposed CNN model is estimated using the technique of k-fold cross-validation with k = 5.
In k-fold cross-validation, the dataset is divided randomly into 5 groups. During the first fold, one group is held out for testing, and the remaining groups are used to train the model. After training, the held-out dataset is used for evaluating the model, and the evaluation scores are obtained. The model is then discarded, and the above process is repeated 4 more times, each time choosing a different group as the hold-out data. The training accuracy and loss of each fold are shown in Fig. 8, and the validation results obtained during each fold are shown in Table 2. The box plot shown in Fig. 9 gives a better understanding of the spread of the accuracy and loss values obtained in each fold. The mean values of the accuracy and loss in the k-fold validation are 99.77% and 0.0184, respectively, and their corresponding median values are 99.8% and 0.0234, respectively. The performance of the proposed CNN model is compared with four traditional supervised machine learning (ML) classification algorithms: Naive Bayes (NB) (Rish et al. 2001), Support Vector Machine (SVM) (Cortes and Vapnik 1995), Decision Tree (DT) (Breiman et al. 2017), and K Nearest Neighbour (KNN) (Cover and Hart 1967). In ML approaches, the features from the image have to be extracted manually for training the model. In this study, we have extracted features directly from the Doppler images using efficient image feature extraction methods such as Histogram of Oriented Gradients (HOG) (Dalal and Triggs 2005), Hu Moments (Hu 1962), Gray Level Co-occurrence Matrix (GLCM) (Haralick et al. 1973), and Local Binary Pattern (LBP) (Ahonen et al. 2004). HOG is used to extract features related to the edges, and LBP gives the surface texture information. Hu Moments and GLCM provide information about the shape of the objects in the image. The extracted features are then combined to form the feature set for training the machine learning models.
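The 5-fold protocol described at the start of this section can be sketched as a random index split; the helper name, seed, and sample count of 10,000 (from the dataset above) are used for illustration.

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Randomly split sample indices into k hold-out groups (a sketch)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    return np.array_split(idx, k)

folds = kfold_indices(10000, k=5)
# Each fold serves once as the hold-out test set; the other four train.
splits = [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
          for i in range(5)]
```

Each of the five (train, test) pairs covers the full dataset exactly once as hold-out data, which is what makes the averaged fold scores an unbiased skill estimate.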
As the number of dimensions in the combined feature set is high, it may cause the curse of dimensionality problem (Verleysen and François 2005). So, feature selection followed by principal component analysis is used to reduce the dimensionality of the feature set. The reduced feature set is then used to train the ML classifier models SVM, KNN, DT, and NB. The trained ML models are then evaluated on the testing dataset, and their performance is compared with the proposed CNN model.
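The PCA step can be sketched with an SVD-based projection. The feature matrix here is random, standing in for the combined HOG/LBP/GLCM/Hu feature set, and the dimensions are illustrative.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project features onto the top principal components via SVD (sketch)."""
    Xc = X - X.mean(axis=0)                  # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T          # reduced feature set

rng = np.random.default_rng(42)
features = rng.normal(size=(200, 50))        # stand-in combined feature matrix
reduced = pca_reduce(features, 10)           # keep the 10 strongest components
```

The components come out ordered by explained variance, so truncating to the first few keeps the most informative directions while shrinking the feature set the ML classifiers must handle.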
The classification performance evaluation metrics used in this paper to compare the various experimental models are the confusion matrix, precision, recall, F1-score, and accuracy. The confusion matrix provides a simple and easy-to-understand visualization of classifier model performance.
Each row of the matrix shows instances of the actual class, and each column shows instances of the predicted class. In the confusion matrix shown in Fig. 10, the entry A_pP_c represents the number of instances of actual class precipitation predicted as class clear. The principal diagonal cells (A_cP_c and A_pP_p) of the matrix give the number of correct classifications, and the secondary diagonal cells (A_cP_p and A_pP_c) give the number of wrong classifications. A good classifier model will have higher values in the principal diagonal cells and lower values in the secondary diagonal cells. The confusion matrices of the various experimental models on the test dataset are shown in Fig. 11. The obtained results show that the proposed CNN model is better than the other experimental ML models for our DPS image dataset.
The values of the performance metrics precision, recall, and F1-score range from 0 to 1; the higher the value, the better the model performance. The precision of class x gives the ratio of correctly predicted instances of class x to all instances predicted as class x (see Eqs. 4 and 5), whereas the recall of class x gives the ratio of correctly predicted instances of class x to all instances that belong to the actual class x (see Eqs. 6 and 7). The F1-score is the harmonic mean of precision and recall, calculated using Eqs. 8 and 9.
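These per-class metrics follow directly from confusion-matrix counts. The counts below are invented for illustration and are not the paper's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from confusion-matrix counts (Eqs. 4-9 form)."""
    precision = tp / (tp + fp)       # fraction of class-x predictions that are right
    recall = tp / (tp + fn)          # fraction of actual class-x instances found
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return precision, recall, f1

# Illustrative counts for one class (tp/fp/fn are hypothetical):
p, r, f1 = precision_recall_f1(tp=990, fp=10, fn=5)
```

Because the F1-score is a harmonic mean, it stays low unless both precision and recall are high, which makes it a stricter single-number summary than either metric alone.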
The comparison results shown in Fig. 12 indicate that the proposed CNN model attains the highest precision, recall, and F1-score among the experimental models. The classifiers are also evaluated using the receiver operating characteristic (ROC) curve. The y-axis of the ROC curve is the TPR (True Positive Rate), which is the same as the recall (see Eqs. 6 and 7), whereas the x-axis is the FPR (False Positive Rate) (see Eqs. 10 and 11). The ROC curve of a class is plotted by calculating the TPR and FPR at different threshold points of the class's prediction probability, which ranges from 0 to 1. A good classifier model should have a high TPR and a low FPR. The information from a ROC curve can be summarized using the AUC, which measures the area under the ROC curve. An ideal classifier will have an AUC value of 1. The ROC-AUC plot of the proposed CNN model is shown in Fig. 13, and the AUC value obtained for both the clear and precipitation classes is 1 (value approximated to 4 decimal places). The accuracy mentioned in this paper is the overall accuracy, which is the percentage of correct predictions out of the total predictions or observations (see Eq. 12). The accuracy results obtained from the different ML classifiers and the proposed CNN model on the testing dataset are tabulated in Table 3. The overall accuracy of the proposed CNN model is 99.60%, and it shows better skill in Doppler spectra classification as compared to the other models. SVM also performs well, though its classification accuracy (98.70%) is lower than that of CNN. The classification accuracies of the KNN, DT, and NB classifier models are 98.45%, 96.75%, and 93.30%, respectively.
Overall Accuracy (%) = (A_cP_c + A_pP_p) / (A_cP_c + A_pP_p + A_cP_p + A_pP_c) × 100    (12)

The model training, testing, and experimental study were conducted on an AMD Ryzen 5 4000H processor with 6 cores, 12 threads, a 3 GHz clock speed, and 16 GB of DDR4 RAM. The total time for training the proposed CNN model on a system with the above specification is approximately 11 min. Also, the time for classifying a new DPS image input using the trained model is approximately 1.5 ms. We have conducted experiments to understand the impact of data size on the model training time and test accuracy (see Fig. 14). The experiments showed that training time and test accuracy increase with data size, and trendline analysis of the plot revealed a linear trend.
In order to understand the significance of preprocessing in the proposed model architecture, an experimental study was conducted. The study results show that the preprocessing steps have improved the classification accuracy of the proposed CNN model when the intensity of the precipitation signatures is low (see Fig. 15). Also, an analysis using the image quality metric SSIM (Structural Similarity Index Measure) (Setiadi 2021) is employed to understand the importance of each stage of the preprocessing pipeline (see Table 4); applying Wiener filtering after histogram equalization improves the image quality. Class activation mapping is used to visualize and highlight the discriminative regions detected by the proposed CNN model for the classification of any given DPS image (see Fig. 16). By observing the CAM results, it is clear that the proposed model can correctly identify the discriminative regions and the key features in the DPS images.

Summary and conclusions
A unique stratosphere-troposphere wind profiler radar at 205 MHz has been operational at the tropical location of Cochin since January 2017, probing an altitude range of 315 m to 20 km. The primary aim behind setting up the 205 MHz radar at this particular location is to understand the circulation features associated with the Indian summer monsoon. This WPR is the only one operational in the world at this frequency. The Doppler power spectra obtained from the radar during rainy and non-rainy conditions have different features. During rainy conditions, the Doppler power spectra have the signatures of both ambient air motion and the fall velocity of raindrops. In contrast, the DPS contains signatures of only ambient air motion in non-rainy cases. In order to classify the Doppler power spectra for rainy and non-rainy conditions, an automated approach is necessary. In this work, we have implemented a convolutional neural network to classify the Doppler power spectra from the 205 MHz wind profiler for the first time. The accuracy of the CNN for DPS classification is assessed using various metrics such as precision, recall, and F1-score, and all these values are close to 1. The performance of the CNN is compared against other well-known image classifiers that use machine learning algorithms: Support Vector Machine, Decision Tree, K Nearest Neighbour, and Naive Bayes. It is found that the image classification accuracy of the CNN, at 99.60%, is better than that of the other models. The proposed CNN model can automatically extract key features from the DPS image and performs better than the combined classification efficiency of four powerful traditional feature extractors paired with standard ML classifiers. One of the challenges in designing a simple optimal custom CNN model is fixing the number of convolution layers, pooling layers, activation layers, and the number of hidden layers and nodes in the fully connected part.
For this, we have experimented with different combinations of these layers and found that we get better results with the proposed CNN model with three convolution layers, three max-pooling layers, three ReLU layers, and three dense layers. In this work, we have used an exhaustive sweep search to perform hyperparameter tuning.
The DPS images that the proposed model misclassified are shown in Fig. 17. Noise in the DPS images that creates multiple peaks similar to rainy cases has caused the misclassification of clear DPS (actual class). The misclassification of precipitation DPS (actual class) is due to the low intensity of the precipitation signature. Nevertheless, the proposed CNN model classifies DPS images with fewer misclassifications than the other machine learning-based models. Rather than applying preprocessing to the DPS image, it is advisable to apply signal preprocessing to the raw data, given the specific nature of the signal and noise in radar echoes. With future research focusing on signal-based preprocessing techniques to reduce noise and improve the signal strength of the raw data before converting it into a DPS image, the classification performance of the proposed model can be improved.