Hybrid Enriched Stacked Auto Encoder and FFO-MPARCNN Algorithm for Multispectral Image LULC Classification

Abstract: Satellite imagery classification is fundamental to spatial knowledge discovery. Land Use and Land Cover (LULC) maps are created using a variety of image classification techniques, making it easier to study spatial and ecological processes as well as human activities, and LULC classification is one of the best-known applications of geographical monitoring. Owing to its improved feature learning and feature representation capacity, the convolutional neural network (CNN) has recently achieved several breakthroughs in feature extraction and classification of multispectral images compared with conventional machine learning approaches. Standard CNN models, however, have certain disadvantages, such as a large number of layers, which leads to high computational cost. To tackle this issue, this work formulates the Hybrid Enriched Stacked Auto Encoder and Pre-Activated Residual Convolutional Neural Network combined with a Fruit Fly Optimization Algorithm (HESAE-FFO-MPARCNN), in which FFO is used to optimise parameters and thus enhance classification accuracy. With its tuned hyper-parameters, the designed FFO-MPARCNN model outperforms classical models such as PB-RNN, ResNet, and FHS-DBN in computational efficiency and classification accuracy.


Introduction
One of the most important techniques in the quantitative analysis of remotely sensed images is multispectral image classification [1]. A pixel's characteristics are normally recorded over a variety of spectral channels in multispectral images, and the ground cover is typically mapped from remotely sensed data using a supervised image classification system. LULC classification is the relevant subfield of land cover monitoring: it refers to the process of categorising remote sensing images into different land cover groups, for example, plantations, water, built-up areas, roads, and forests [2]. To identify satellite images or generate accurate maps from satellite data, machine learning (ML) algorithms such as k-means clustering and the maximum likelihood classifier [3] have historically been used. Over the last decade, however, the use of deep learning-based methodologies to resolve domain-specific difficulties has grown in both popularity and effectiveness [4]. In particular, researchers have recently applied deep convolutional neural networks (DCNN) to computer vision tasks [5]; they are well suited to problems such as semantic segmentation of images, as they can acquire low-level and high-level features in a hierarchical fashion.
Many LULC classification studies have been reported in recent years utilizing multi-spectral LiDAR data [8].
Higher point densities have been reached with the advent of LiDAR technology, implying a heavier numerical workload when LULC classification is performed. While the above approaches have provided core land-based classification technologies for LiDAR data, those that rely on standard low-level LiDAR features, for example intensity, entropy, and skewness metrics, are unable to obtain detailed high-level characteristics from LiDAR data. Modern deep learning strategies that emphasise network layer depth and reduce the fitting burden of each layer can draw deep, high-level features from the original data, which leads to enhanced accuracy and consistency in classification [9]. Among conventional deep learning architectures, the convolutional neural network (CNN) has established its credibility for image processing.
The CNN, a neuroscience-inspired multi-layered deep learning framework, uses a single network from training through prediction. A CNN learns spectral and spatial information from images using stacked convolution kernels to obtain high-level abstract characteristics. As the network goes deeper, the CNN performs classification tasks by establishing a relationship between input image samples and output labels. Thanks to the availability of large amounts of training data and effective computing platforms, the CNN model has outperformed many standard neural network approaches (namely auto-encoding, sparse coding, and the restricted Boltzmann machine) in many vision-related tasks, including visual detection [11], scene labelling [12], and face recognition [13].
Throughout the model training stage, hyper-parameters such as the dimension of the input, the number of convolution kernels, convolution kernel size, learning rate, and pooling window size must be set, and they have a huge effect on training and the final predictions. As a result, constructing a powerful CNN model and determining the necessary parameters for LULC classification tasks is extremely difficult and time-consuming. Moreover, there are no simple guidelines for optimising CNN hyper-parameters at the moment; they are often decided by a designer's knowledge and instincts [15].

The Hybrid Enriched Stacked Auto Encoder and Modified Pre-Activation Residual Convolutional Neural Network with Fruit Fly Optimization Algorithm (HESAE-FFO-MPARCNN) is proposed in this work to reduce the computational complexity of the multi-spectral image-based LULC classification task. In brief, the key contributions of the proposed multi-spectral LULC classification design are:
• The HESAE-FFO-MPARCNN model was developed for the multispectral land cover classification challenge.
• The hyper-parameters of the built ESAE-MPARCNN model are addressed and configured using the FFO approach to guide multi-spectral LULC classification.
The paper is structured as follows. Section 2 discusses work on machine learning and especially deep learning-based methods for classification in general, and for LULC mapping and interpretation in particular. Section 3 defines the HESAE-FFO-MPARCNN model for multi-spectral Sentinel-2 image-based land cover classification. Section 4 illustrates and explains the experimental results.
Finally, Section 5 contains the closing remarks.

Related Work
With deep learning outperforming traditional machine learning methods in classifying images, the remote sensing community has recently shown increased interest in using these techniques to identify LULC from multispectral and hyperspectral images (HSI). A modern patch-based RNN, denoted PB-RNN, was proposed and optimised for multi-temporal remote sensing data; by integrating the full multi-spectral, multi-temporal, and spatial information in remote sensing images, it captures their spatial and sequential interdependence [16].
In [17], a recurrent residual network (Re-ResNet) architecture was introduced that can acquire a joint spectral-spatial-temporal feature representation within a unified framework. To achieve this, a residual convolutional neural network (ResNet) and a recurrent neural network (RNN) are merged into a single end-to-end architecture. The authors of [18] suggested a novel framework for multispectral and panchromatic image recognition using adaptive multi-scale convolutional neural networks and a perceptual loss feature.
The authors of [19] proposed a deep learning-based multi-spectral LULC classification system. A spectral-texture classification model is built using the contourlet transform's excellent detail-capture ability to acquire details that complement the spectral feature space, coupled with deep learning for feature selection and extraction. The work in [20] focuses on land use classification and suggests the Firefly Harmony Search (FHS) optimization algorithm for training a Deep Belief Network (DBN).
To identify HSI, [21] used a hybrid stacked autoencoder (SAE) architecture with a support vector machine (SVM) classifier. The algorithm is modified based on a convolutional neural network in [21], and tests are conducted on multi-source remote sensing images of various geomorphologies acquired under three separate climatic conditions to validate the improved convolutional neural network's efficacy and scalability.
Concentrating on the classification of LULC from multispectral and hyperspectral images, this paper offers a state-of-the-art analysis by incorporating several approaches documented in the literature into a standardized deep learning system that addresses various aspects of the issue. Power scaling and energy conservation, on the other hand, are key concerns for onboard processing. Consequently, reducing the complexity of the models is an important concern for future work.

Proposed Methodology
The feasibility of a HESAE-FFO-MPARCNN was investigated in this study. In the ESAE-DNN, three hidden layers are deployed before the softmax classifier for feature extraction, followed by classification with the FFO-MPARCNN.

Study Area
The research was carried out in Vietnam's Dak Nong Province, in the Central Highlands (Figure 2). The province has a humid tropical highland climate, influenced by dry, hot southwest monsoons. The research area covers 6516 km² and is marked by significant fragmentation, rendering LULC classification especially difficult. The natural forest is made up of areas of natural evergreen broadleaved, deciduous dipterocarp, mixed bamboo, and semi-deciduous forest with varying degrees of disturbance, much of it caused by humans [23].
The latent representation h was then mapped back to a reconstruction x̂ of the (possibly corrupted) input, as described in Eq. (2),

x̂ = g(W′h + b′)  (2)

where g(·) is a nonlinear activation function. The reconstruction error between the initial input x and the rebuilt x̂ defines the cost function of the SAE,

J(W, b) = (1/N) Σ_i ‖x_i − x̂_i‖²  (3)

In addition, a sparsity constraint with target ρ is imposed on the SAE's hidden units by adding a penalty term to the objective function in Eq. (4). The cost function of the sparse SAE then becomes

J_sparse(W, b) = J(W, b) + β Σ_j KL(ρ ‖ ρ̂_j)  (4)

where the penalty term is the Kullback–Leibler divergence between ρ and ρ̂_j,

KL(ρ ‖ ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j))  (5)

in which ρ is the target sparsity for the j-th hidden unit, ρ̂_j is its average activation, s is the number of hidden units, and β is the weight of the penalty term. By defining a set of overcomplete base vectors, a sparsity dictionary code is applied to the input data in all layers to obtain the optimal image characteristics. Centred on sparse feature representation at the pixel level, the three-hidden-layer sparse coding approach [24] reveals the critical features of an image. A deep classification network can be created by cascading a stacked autoencoder with a softmax classifier, allowing the stacked autoencoder to contain two or more auto-encoding layers. The three autoencoders in the DNN classifier with ESAE are shown in Fig. 3. Taking an input vector x, parameters θ, and the corresponding output class y, the input vector is fed through the MPARCNN. The training approach aims to adjust the DNN parameters so that the network learns the training input vectors and classifies the respective output values with maximum precision.
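To make the cost terms above concrete, the sketch below computes the sparse reconstruction cost of Eqs. (3)-(5) in NumPy. The sigmoid activation, the toy layer sizes, and the names `kl_divergence`/`sparse_ae_cost` are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_divergence(rho, rho_hat):
    # Kullback-Leibler divergence between the target sparsity rho
    # and the average hidden activations rho_hat (Eq. 5)
    return (rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def sparse_ae_cost(X, W, b, W2, b2, rho=0.05, beta=3.0):
    H = sigmoid(X @ W + b)        # latent representation
    X_hat = sigmoid(H @ W2 + b2)  # reconstruction (Eq. 2)
    recon = np.mean(np.sum((X - X_hat) ** 2, axis=1))  # Eq. 3
    rho_hat = H.mean(axis=0)      # average activation per hidden unit
    return recon + beta * np.sum(kl_divergence(rho, rho_hat))  # Eq. 4
```

The penalty vanishes when the average activation matches the target ρ exactly and grows as the hidden units become more active, which is what pushes the learned code towards sparsity.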
Training approach. The following steps describe the feature-extraction training process of the ESAE-DNN classifier.
• The first autoencoder layer is trained on the initial input vector x with x itself as the target, so the layer derives features while attempting to reconstruct its input.
• The second autoencoder layer is trained with the first layer's output vector h1 as its input, generating the output vector h2 and learning to reconstruct h1.
• The third autoencoder layer is trained with h2, the output of the second layer, as its input, generating the output vector h3 and learning to reconstruct h2.
• A softmax classifier is cascaded onto the stacked autoencoder. This layer is trained with h3, the output of the third autoencoder layer, as its input vector and the class labels from the training data as its target vector.
• Backpropagation is then used to boost the efficacy of the DNN as far as feature extraction is concerned. This fine-tuning is performed in a supervised manner by backpropagating through the network with the training labels.
After pre-processing and feature extraction, the MPARCNN is used to classify the features. Its convolutional layers are embedded in residual blocks, whose skip connections add low- and high-level features together, so a deep network can minimise the problem of vanishing or exploding gradients. Each residual block may be defined as

y = f(F(x, W) + x)  (6)

where x and y are the residual block's input feature maps and class-label output, F denotes the residual learning function and its value is the convolutional-layer output before the summation operation, and f is the activation function. Batch normalisation (BN) and a pre-activation mechanism are applied in the proposed residual network block for optimum output. The pre-activation architecture is implemented by moving BN and the Rectified Linear Unit (ReLU) before the weight layers, so the pre-activation residual block is expressed as

x_{l+1} = x_l + F(ReLU(BN(x_l)), W_l)  (7)

The activation function is ReLU, defined as

f(x) = max(0, x)  (8)

For the recurrent convolutional layer (RCL), consider a sample at location (i, j) on the k-th feature map and let t be the current time step of the network. The output is then formulated as

z_k(i, j, t) = (w_k^f)ᵀ x^f(i, j) + (w_k^r)ᵀ x^r(i, j, t − 1) + b_k  (9)

where x^f(i, j) and x^r(i, j, t − 1) are the feed-forward input from the regular convolution layers and the recurrent input of the RCL, w_k^f and w_k^r are the corresponding weights of the standard convolutional layer and the RCL for the k-th feature map, and b_k is the bias. Only 1×1 and 3×3 convolution filters were used in this implementation, inspired by the IRRCNN [25] model, so the number of network parameters is kept to a minimum. Applying a 1×1 filter makes it possible to influence the convolution layer while keeping the non-linearity of the decision function low.
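The pre-activation ordering of the residual unit can be illustrated with the minimal sketch below. For brevity, dense weight matrices stand in for the 1×1/3×3 convolutions and the batch normalisation carries no learned scale or shift, so this is an assumption-laden toy rather than the paper's actual network.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def batch_norm(x, eps=1e-5):
    # per-feature normalisation (no learned scale/shift, for illustration)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def pre_activation_block(x, W1, W2):
    """Pre-activation residual unit: BN -> ReLU -> weight, twice, plus identity skip (Eq. 7)."""
    h = relu(batch_norm(x)) @ W1
    h = relu(batch_norm(h)) @ W2
    return x + h  # the skip connection adds low- and high-level features
```

Because the identity path is untouched, a block whose residual branch outputs zero simply passes its input through, which is what lets very deep stacks train without vanishing gradients.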
As in IRRCNN blocks, the input and output feature sizes are unchanged, essentially a direct projection of the same dimension, and non-linearity is applied through ReLU activation functions. A dropout of 0.5 is used after every convolution layer in the transformation block. Finally, the softmax operation for the k-th of L classes can be defined, for the input features x of an MS sample and the weight vectors w of the L distinct linear functions, as

P(y = k | x) = exp(xᵀ w_k) / Σ_{l=1}^{L} exp(xᵀ w_l)

A series of tests on several benchmark datasets was performed and the results are compared with different models to evaluate the proposed MPARCNN structure. Fig. 5 illustrates the structure of the proposed MPARCNN.
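The closing softmax operation can be sketched numerically as follows; the max-subtraction for numerical stability and the toy shapes are implementation choices of this sketch, not part of the paper's formulation.

```python
import numpy as np

def softmax_classify(x, W):
    """Softmax over L linear functions: P(y=k|x) = exp(x.w_k) / sum_l exp(x.w_l)."""
    scores = x @ W          # one linear function (column of W) per class
    scores -= scores.max()  # numerical stability; probabilities are unchanged
    p = np.exp(scores)
    return p / p.sum()
```

The predicted class is then simply the index of the largest probability.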
The fitness is measured as the mean squared error,

MSE = (1/N) Σ_i (D_i − P_i)²  (14)

where D stands for the desired value, P for the predicted value, and N for the number of images. The fundamental concept behind FFO-MPARCNN is to use FFO to identify the right MPARCNN parameters.

The Proposed FFO-MPARCNN Classifier
The parameters are represented by the location coordinates of the fruit flies, and every fruit fly has a smell concentration judgement function. Classification precision is measured as the fitness function in FFO-MPARCNN.
The MPARCNN candidate parameters are treated as the flies' coordinates. Consequently, the parameters should be neither too large nor too small, and the search should look for the most acceptable parameters only within a defined range. However, the relationship between parameters and accuracy is unlikely to be a straightforward mapping, since several extreme points contribute to increased complexity. Based on these observations, the parameter-selection problem of the MPARCNN is solved in this phase. On the one hand, the most appropriate parameters can be identified by searching only in a small area; on the other hand, the search must be able to visit multiple extrema. FFO is well suited to problems with these two characteristics. FFO first initialises the search with a small collection of random numbers, then chooses the most relevant MPARCNN parameters from those numbers based on MPARCNN's classification precision and records them.
The random numbers are then adjusted towards the most appropriate parameters, and the adjusted numbers, along with an additional set of random numbers, are considered new candidates for better parameters. FFO continues to pick the most appropriate MPARCNN parameters from the new candidates based on MPARCNN's classification accuracy. If better parameters are discovered, FFO saves them, adjusts the candidate parameters towards the better ones, and then considers the adjusted numbers plus a fresh set of random numbers as the updated candidate parameters. Otherwise, FFO proceeds by adding a fresh group of random numbers to the previous random numbers rather than the modified ones. It is easy to see from this description that FFO steers the initial random parameters along a path towards the most appropriate parameters instead of selecting them purely at random. The optimal parameters determined by FFO are then used to evaluate the MPARCNN model. The fruit flies are also allowed to fly at random to help FFO escape from local optima and boost its global search ability. The FFO-MPARCNN steps are as follows:
Step 1: Set up the MPARCNN parameters, for instance, the number of layers, dropout rate per layer, units per layer, and L2 (or L1) regularisation parameters. Set the FFO parameters, such as the swarm position radius, iteration number N, and group size S, to their default values. Initialise the location of the fruit fly swarm at random.
Step 2: Using osphresis, give three-quarters of the fruit flies a random direction and distance to search for food:

X_i = X_axis + Rand,  Y_i = Y_axis + Rand  (15)

where Rand denotes a random value.
Step 3: Since the direction of the food cannot be determined, the distance to the origin is measured first (Dist_i), followed by the smell concentration decision value (SC_i), which is the inverse of the distance:

Dist_i = sqrt(X_i² + Y_i²),  SC_i = 1 / Dist_i  (16)

Step 4: To find the smell concentration (Smell_i) at the individual position of each fruit fly, substitute SC_i into the smell concentration judgement function:

Smell_i = Function(SC_i)  (17)
Step 5: Identify the fruit fly with the highest smell concentration in the swarm:

[bestSmell, bestIndex] = max(Smell)  (18)
Step 6: Record the best smell concentration value and the corresponding x, y coordinates; the fruit fly swarm then uses vision to fly to that spot.
Step 7: Check whether the process has reached the full number of iterations; if so, stop iterating.
Step 8: If not, increase the iteration count by one, repeat Steps 2-5, and check whether the smell concentration is better than in the preceding iteration; if it is, proceed to Step 6.
Step 9: In each iteration, a neighbour is chosen at random. If the selected neighbour has a lower MSE than the current state, the neighbour is accepted and its parameter values are used as the new parameter values. Eventually, the best SC value and x, y coordinates obtained are the values of the optimal parameters to be discovered; they are then entered into the MPARCNN and the process proceeds to the next level.
Step 10: MPARCNN training phase: evaluate the hyper-plane parameters using Eq. (14).
Step 11: MPARCNN testing phase: assign the class to a new data point using the judgement function from Step 3, measuring the distances.
Step 12: Evaluate the class using Eq. (19), i.e. the class is assigned to the pattern depending on the difference between the respective planes.
Step 13: Bring the process to an end.
The mechanism of the proposed FFO-MPARCNN algorithm is summarised in the flow map of Fig. 5: measure the decision value SC and determine the smell concentration; find the fruit fly with the highest smell concentration in the swarm; keep updating the swarm's location and distance; verify whether the number of iterations has been met; and output the optimal parameter values as the final optimal parameters.
The best parameters are then used in the MPARCNN to identify the LULC MS images. The intersection over union (IoU) is defined as

IoU = (area of overlap) / (area of union) = TP / (TP + FP + FN)
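The FFO steps above can be condensed into the following toy loop. The quadratic objective stands in for MPARCNN's validation error, and names such as `ffo_minimise` are hypothetical, so this is a sketch of the search dynamic under stated assumptions, not the authors' code.

```python
import numpy as np

def ffo_minimise(judgement, n_flies=20, iters=100, seed=0):
    """Fruit Fly Optimisation sketch: flies scatter around the swarm centre,
    the smell concentration SC = 1/Dist is fed to the judgement function,
    and the swarm flies to the best-smelling position each iteration."""
    rng = np.random.default_rng(seed)
    cx, cy = rng.random(2)                        # Step 1: random swarm location
    best_sc, best_val = None, np.inf
    for _ in range(iters):
        xs = cx + rng.uniform(-1, 1, n_flies)     # Step 2: random direction/distance (Eq. 15)
        ys = cy + rng.uniform(-1, 1, n_flies)
        dist = np.sqrt(xs ** 2 + ys ** 2)         # Step 3 (Eq. 16)
        sc = 1.0 / dist                           # smell concentration decision value
        smell = np.array([judgement(s) for s in sc])  # Step 4 (Eq. 17)
        i = int(smell.argmin())                   # Step 5 (Eq. 18), minimising error here
        if smell[i] < best_val:                   # Steps 6-8: keep the best and fly there
            best_val, best_sc = smell[i], sc[i]
            cx, cy = xs[i], ys[i]
    return best_sc, best_val

# e.g. a stand-in error surface whose optimum is at parameter value 0.7
best_param, best_err = ffo_minimise(lambda s: (s - 0.7) ** 2)
```

In the actual classifier the judgement function would train and validate an MPARCNN configuration decoded from SC, which is far more expensive per evaluation but follows the same loop.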

Fig.6. Precision performance comparison
The precision comparison between PB-RNN, ResNet, FHS-DBN, and FFO-MPARCNN is shown in Figure 6. Compared with the other approaches, the FFO-MPARCNN approach achieves a high precision rate.
It is a reliable method of obtaining classification data, with a precision rate of 94%. The existing PB-RNN, ResNet, and FHS-DBN have lower precision rates of 90%, 92.5%, and 92.55%, respectively, than FFO-MPARCNN, since FFO improves the generalisation potential of MPARCNN and thereby increases classification precision. The numerical values of the precision performance comparison are given in Table 1 and Table 2.
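For reference, the precision, recall, and IoU figures reported in these comparisons can be computed from a confusion matrix as below; the function name and the tiny label vectors are illustrative, not the paper's evaluation code.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Per-class precision, recall and IoU from a confusion matrix."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                  # rows: true class, cols: predicted class
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp           # predicted as the class but wrong
    fn = cm.sum(axis=1) - tp           # missed instances of the class
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    iou = tp / np.maximum(tp + fp + fn, 1)
    return precision, recall, iou
```

Averaging these per-class values over the LULC classes yields the single percentages quoted for each model.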

Fig.8. Recall performance comparison
The recall comparison between PB-RNN, ResNet, FHS-DBN, and the proposed FFO-MPARCNN is shown in Fig. 8. The proposed FFO-MPARCNN process has a high recall value of 95%. It is clear from the findings that the proposed FFO-MPARCNN has a high recall rate, suggesting a high cluster-forming rate.
The current systems PB-RNN, ResNet, and FHS-DBN have lower recall rates of 90%, 93%, and 94%, respectively, indicating that the proposed work can deliver better grading outcomes than the existing systems. Most importantly, the use of an FFO and MPARCNN-based classifier in this study solved the issue of poor convergence and significantly reduced the computing requirements of MS image classification, demonstrating the potential utility of the proposed FFO-MPARCNN method in processing such high-dimensional imagery. The numerical values of the recall performance comparison are given in Table 3.
Accuracy comparison

Fig. 9. Result of Accuracy
The accuracy comparison for MS image classification is shown in the graph above (Fig. 9). As the amount of data is expanded, the accuracy value increases linearly. The numerical values of the accuracy performance comparison are given in Table 4 and Table 5.

Conclusion and Future Work
This work proposed a CNN model with refined hyper-parameters for ground cover classification from multispectral satellite images. This study's major contributions can be summarised as follows: (1) the HESAE-FFO-MPARCNN model was developed for the multispectral land cover classification challenge, and (2) its hyper-parameters were configured using the FFO approach. A remaining weakness is that the model must learn many parameters, so the network is vulnerable to overfitting. This weakness will be addressed in the future by incorporating a more robust LULC classification system.

Conflict of interest:
There is no conflict of interest.

Funding:
There is no funding information.

Availability of data and material:
There is no availability of data and material.

Code Availability:
There is no code availability.

Author's contribution:
There is no author's contribution.