Weeds Detection and Classification using Convolutional Long-Short-Term Memory

Smart agricultural robotic systems can reduce the dependence on traditional crop-spraying of agrochemicals such as pesticides, herbicides, and fertilizers. Conventional spraying schemes are not sufficient to control weeds and raise crop production to the level needed to meet the food requirements of the world population. Smart and intelligent farming systems are therefore being introduced to increase crop production and reach production targets. In this paper, Deep Learning (DL) based algorithms are applied to the identification and classification of weed plants using a combination of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). The CNN has a structure suited to extracting discriminative features from the input images, and the LSTM allows the classification to be jointly optimized. To validate the proposed scheme, nine kinds of weeds are classified, including vine weeds, three-leaf weeds, spiky weeds, and invasive creeping weeds. We carried out extensive experiments and achieved an average classification accuracy of 99.36%. The results show that the CNN-LSTM combination has significantly higher classification capability than other prominent existing approaches.


Introduction
The world population is expected to reach around 9.7 billion by the year 2050. Currently, the world's population is around 7.3 billion and that number is still growing, according to a United Nations (UN) report [1]. A population of this magnitude brings many challenges, and sufficient food production is one of the leading ones. The UN Food and Agriculture Organization predicts that worldwide food production must be boosted by 70% over the next few decades to feed the projected population of 2050 [2]. Approximately 250,000 plant species exist, and about 250 of them are weed species [3][4][5]. These have a great influence on the reduction of agricultural production [5][6][7]. Weed plants are essentially a by-product of desirable crops and grow simultaneously with them. A weed is an unwanted plant that consumes water and the nutrients/minerals of the crop plants, which ultimately reduces the growth of the wanted plants that are the essential, prime product of the crop. To address this problem, this article carries out accurate and robust detection of weeds and desirable plants, leveraging CNN and LSTM models to perform the detection.
Weed plants can be classified according to two major factors [4,8,9]: life span and ecological affinities.
Life span: These weeds are mostly seen in non-cropped areas; they live for more than two years and are involved in destroying seeds, underground stems, roots, rhizomes, tubers, etc.
Ecological affinities: These weeds grow as a result of particular ecological affinities and appear via seeds in semi-aquatic conditions.
Agricultural mechanization is a dominant research field for increasing crop yield and controlling undesirable weeds [10,11]. To increase crop productivity, the classification of weed plants is a critical part of an intelligent sprayer system [12]. Agricultural chemicals enhance crop production, but they also cause serious health complications and environmental effects [13,14]. Precision agriculture machines are required to reduce the excessive use of pesticides and herbicides [15,16]. Weed detection is an attractive field of research for data scientists, and many machine learning based techniques have been proposed [17]. Many researchers have used computer vision algorithms for the identification and classification of crop and weed plants [18,19]. The literature shows that many hand-crafted and deep learning based models are available and have contributed significantly [20][21][22][23][24][25]. For the classification of weeds, various methods have been proposed, such as colour classification approaches for perennial weed identification [18], deep convolutional neural networks (DNN) [26][27][28], a CNN-based approach for separating sugar beet plants and weeds [29], Gabor wavelets [30], Gabor wavelets with a neural network [31], the Kalman filter [32], decision trees and artificial neural networks [33], hyperspectral imaging with wavelet analysis [34], the support vector machine (SVM) [35], the Haar wavelet transform with k-nearest neighbour [36], and SVM and Bayesian classifiers [37,38]. Skovsen et al. [39] presented fully convolutional neural network algorithms for the detection of weed and grass. Tang et al. [40] employed K-means features with a CNN to identify weeds. Adel et al. [41] presented support vector machines and artificial neural networks to detect weed plants based on shape features.
These methods have brought impressive improvements to the agricultural sector and yielded remarkable performance. However, more effective and advanced classification methods are required to improve the classification accuracy for weed plants. Classifying weeds is extremely difficult because the leaves of crops and weeds frequently overlap each other at late growth stages [41]. In this research, a deep learning based hybrid approach combining CNN and LSTM is applied to detect and classify weed plants. The main idea of the D-CNN-LSTM is to build a self-learned scheme that learns features and improves the classification accuracy for weed species. The rest of the paper is organized as follows: Section 2 presents the proposed methodology, Section 3 describes the experiments, Section 4 presents the obtained results and related discussion, and Section 5 concludes the paper.

Methodology for Classification of Weeds
In this research work, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models are combined for the classification of weed plants. The proposed model is divided into three main stages. First, the CNN is employed to capture the features; these features are then input to the LSTM; lastly, the LSTM output features are fed into fully connected layers. The details are described in the following subsections.

Convolutional Neural Network (CNN)
Deep neural networks (DNN) can be formulated as graphs of neurons interconnected by weighted edges. Each neuron is associated with an activation value and each edge with a weight. A CNN mainly consists of three parts: convolution layers, max-pooling layers, and fully connected layers. The general architecture of CNN models is shown in Fig. 1. The layers in the CNN perform a series of operations, such as convolution on a data array (called a feature map) and pooling. A rectified linear unit (ReLU) is used as the activation function. In particular, the automatic feature extraction ability of CNN models has extended their use to a wide variety of research applications [42]. Many state-of-the-art deep CNN models have been successfully evaluated in different fields, such as AlexNet [42], GoogLeNet [43], ResNet-50 [44], Inception-v3 [45], VGGNets [46], and Squeeze-and-Excitation (SE) networks [47]. In 2012, AlexNet provided a huge leap forward in reducing classification error rates compared to previous models [42]. In 2014, the winning algorithm (GoogLeNet) reduced the error rate even further [43]. The layers in the CNN model include the following.
Convolution Layer: The convolutional layer is the core layer of a CNN; it produces new images called feature maps. A feature map emphasizes the unique features of the original image. Consider a 2D input image of a weed plant as a matrix $X_{n_1 \times n_2}$ and a filter matrix $F_{m_1 \times m_2}$, where $m_1 \le n_1$ and $m_2 \le n_2$; the matrix $Y = X * F$ is then the result of the 2D convolution of $X$ with $F$.
Pooling Layer: The pooling layer reduces the size of the image by combining the neighbouring pixels of a certain area of the image into a single representative value. Two types of pooling layers are normally used: max-pooling and mean-pooling. In this research work, max pooling is used, in which each output value is the maximum of the values in its pooling window.
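For reference, the convolution and max-pooling operations described above can be written out as follows. This is a sketch based on the standard textbook definitions, not necessarily the exact form used by the authors: the symbols $F$, $Y$, $m_1$, $m_2$, the pooled map $Z$, the pooling size $p$, and the unit-stride, no-padding convention are all assumptions introduced here for illustration.

```latex
% 2D convolution of the n1 x n2 input X with the m1 x m2 filter F
% (no padding, unit stride), giving an (n1-m1+1) x (n2-m2+1) output Y:
Y_{i,j} = \sum_{u=1}^{m_1} \sum_{v=1}^{m_2} F_{u,v}\, X_{i+m_1-u,\; j+m_2-v},
\qquad 1 \le i \le n_1 - m_1 + 1,\quad 1 \le j \le n_2 - m_2 + 1 .

% Max pooling over non-overlapping p x p windows of Y:
Z_{i,j} = \max_{0 \le a,\, b < p} \; Y_{(i-1)p + 1 + a,\; (j-1)p + 1 + b} .
```

Note that many deep learning libraries implement the convolution layer without flipping the filter (that is, as a cross-correlation); since the filter weights are learned, the two forms are equivalent for training purposes.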
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture [48]. In this paper, the peephole LSTM model is used for feature extraction of weed plants. Here, $\sigma$ is the sigmoid function, $W$ and $b$ are the model parameters, and $g_t$ is the nonlinear transformation of the inputs. $c_t$ (where $t$ denotes the time step) is the memory cell that stores the internal state of the LSTM component, and $x_t$ and $h_t$ are the inputs and outputs of the LSTM, respectively. The LSTM additionally contains three gates: the input gate $i_t$, the output gate $o_t$, and the forget gate $f_t$. In the LSTM update equations, '⊙' denotes the Hadamard (element-wise) product. The memory cell is the central concept of an LSTM network: it allows the network to maintain state over time and to learn long-range dependencies from the input sequences.
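The update equations did not survive in the text above. As a hedged reconstruction, one widely used formulation of the peephole LSTM, written with the symbols defined in this section, is sketched below. The subscripted weight names ($W_{xi}$, $W_{hi}$, the peephole vectors $w_{ci}$, $w_{cf}$, $w_{co}$, and so on) and the choice of $\tanh$ for the nonlinear transformation are assumptions; the authors' exact notation may differ.

```latex
\begin{aligned}
i_t &= \sigma\!\left(W_{xi} x_t + W_{hi} h_{t-1} + w_{ci} \odot c_{t-1} + b_i\right) \\
f_t &= \sigma\!\left(W_{xf} x_t + W_{hf} h_{t-1} + w_{cf} \odot c_{t-1} + b_f\right) \\
g_t &= \tanh\!\left(W_{xg} x_t + W_{hg} h_{t-1} + b_g\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
o_t &= \sigma\!\left(W_{xo} x_t + W_{ho} h_{t-1} + w_{co} \odot c_t + b_o\right) \\
h_t &= o_t \odot \tanh\!\left(c_t\right)
\end{aligned}
```

The peephole terms ($w_{c\cdot} \odot c$) are what distinguish this variant from the basic LSTM: they let each gate inspect the memory cell directly, with the output gate seeing the freshly updated $c_t$.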

Proposed CNN-LSTM classifier
To achieve a high classification accuracy for weed plants, the CNN and LSTM techniques are merged. Pre-trained DL networks are used in many applications, and this is a very common practice in the DL field. In this research, we propose a method designed specifically for the classification of weeds and train it on over seventeen thousand weed plant images. The network consists of a series of layers: convolution, batch normalization, max-pooling, rectified linear unit, LSTM, and soft-max layers are used for the classification of the weeds. The proposed algorithm is illustrated in Fig. 2. In the proposed CNN-LSTM architecture, four convolutional layers are used first. The first convolutional layer employs a 10×10 frequency-time filter, the second a 9×9 filter, the third a 5×4 filter, and the fourth a 4×3 filter. Batch normalization and rectified linear units are used for each convolutional layer, and a pooling layer of size 3 is applied after each of them. An LSTM unit comprises a cell and three gates: a forget gate, an input gate, and an output gate [49]. The main role of the cell is to remember values over arbitrary intervals of time, and the three gates control the flow of information into and out of the cell. In this research, four layers with a unit size of 32 are used in the LSTM network. A flatten layer adjusts the output data of the LSTM and provides the interface to the dense layers.
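To make the layer arithmetic of the convolutional front end concrete, the following minimal sketch traces the feature-map size through the four convolution and pooling blocks listed above. The filter sizes and the pooling size come from the text; everything else is an assumption made for illustration: unpadded convolutions with stride 1, a pooling stride equal to the pooling size, a 256×256 input, and the hypothetical helper names `conv_out`, `pool_out`, and `trace_shapes`. Channel counts are not given in the text and are not modelled.

```python
# Sketch of the spatial-size bookkeeping only; not the authors' implementation.
CONV_FILTERS = [(10, 10), (9, 9), (5, 4), (4, 3)]  # filter sizes from the text
POOL_SIZE = 3                                      # pooling size from the text


def conv_out(size, kernel, stride=1):
    """Output length of an unpadded convolution along one axis (assumed)."""
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output length of non-overlapping pooling along one axis (assumed)."""
    return size // pool


def trace_shapes(height, width):
    """Return (height, width) of the feature map after each conv+pool block."""
    shapes = []
    for kernel_h, kernel_w in CONV_FILTERS:
        height, width = conv_out(height, kernel_h), conv_out(width, kernel_w)
        height, width = pool_out(height, POOL_SIZE), pool_out(width, POOL_SIZE)
        if height < 1 or width < 1:
            raise ValueError("input too small for this filter/pooling stack")
        shapes.append((height, width))
    return shapes


if __name__ == "__main__":
    # 256 x 256 is an assumed input size, used here only as an example.
    for block, (h, w) in enumerate(trace_shapes(256, 256), start=1):
        print(f"after block {block}: {h} x {w}")
```

Under these assumptions the map shrinks quickly, so an implementation that pads its convolutions or uses a different pooling stride would produce different sizes; the sketch is meant only to show how the stated filter and pooling sizes interact.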

Experiment and Evaluation
Weeds Dataset: The dataset was originally collected by an Australian research group [50] from different regions of northern Australia. They set two main goals to achieve the required variability and generality of the dataset. The first was to collect around 1,000 images of each of the different species, which helps in training high-complexity CNNs that require large labelled data sets. The second was to split the images equally into positive and negative classes at each location, which helps to prevent over-fitting of the developed models to scene-level image features by ensuring that targets are identified against their native backgrounds. Finally, the data set required expert analysis to label each of the 17,509 images as to whether it contains a target weed species or not. The data set consists of images of nine weed species, and their respective numbers are shown in Table 1 and Figs. 3 and 4.
The learning rate plays an important role in training and validating a deep learning network. In this research work, an optimal learning rate finder has been used to smoothly optimize the learning process. If the learning rate is too small, the optimization takes more time to learn; if it is too high, the optimizer may overshoot or, in the worst case, diverge. To evaluate the performance of the model, 5-fold cross-validation has been used. The data sets were split 60%-40% for training and validation, respectively. The model is trained for five epochs per run and trained 20 times, and the validation accuracy is recorded for each run. The experiments were run on an HP computer with the following specifications: 16 GB of installed memory (RAM), an Intel(R) Core™ i7-4790 CPU @ 3.60 GHz processor, and a 64-bit Windows operating system.
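As an illustration of the cross-validation procedure mentioned above, the sketch below partitions sample indices into five folds using only the Python standard library. It is a generic k-fold splitter, not the authors' code: the function name `k_fold_indices` and the fixed seed are hypothetical. Note also that in plain 5-fold cross-validation each fold holds out roughly 20% of the data; how this was combined with the 60%-40% split reported in the text is not stated, so the sketch does not attempt to reproduce it.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


if __name__ == "__main__":
    # 17,509 is the number of labelled images reported for the data set.
    for fold, (train_idx, val_idx) in enumerate(k_fold_indices(17509), start=1):
        print(f"fold {fold}: {len(train_idx)} training, {len(val_idx)} validation")
```

Each image appears in exactly one validation fold, so averaging the five validation accuracies gives an estimate that uses every labelled image once for evaluation.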

Results and Discussion
We applied an ensemble of CNN and LSTM methods to classify the weeds listed in Table 1. The results of the proposed method are shown in Table 2, which gives the classification accuracy for each weed. The results are promising: the average classification accuracy is above 98% for every class except Negatives. For two weeds, Parkinsonia and Prickly acacia, we achieved 100% accuracy. This shows that the proposed method is well able to classify the weeds accurately. The reason is that the CNN provides features from the data sets, while the LSTM helps in generating a textual description from the image data sets. The LSTM then interprets these features, and this ensemble approach is ultimately quite helpful in classifying the weed data sets. From the results, we get the impression that all the weeds need to be classified for better agricultural production.
Applying deep learning methods to agricultural data sets will help farm management systems with real artificial intelligence, providing strong recommendations and insights for increasing crop production and protecting crops from unwanted plants that can reduce their productivity. Here, the model iterates multiple times, but the classification rate was determined after the 90th iteration. For training and validation progress, we applied the 5-fold cross-validation method to the proposed scheme, as shown in Figs. 6 and 7. In the future, CNN and LSTM models are expected to become even more widespread on agricultural data sets, allowing better productivity. A limitation of the proposed work is that, because the data sets are large, it requires high computation time both with and without augmentation. To overcome this, we suggest that future researchers split the data set into epochs before training and execute each epoch as a training data set in order to validate the classification accuracy, where each epoch generates a different predictive model.
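The per-class accuracies discussed above are read off a confusion matrix such as the one in Table 2. The sketch below shows one common way to do this, assuming rows are true classes and columns are predicted classes. The function names and the 3×3 matrix `TOY_CONFUSION` are made up purely for illustration; they are not the paper's results.

```python
def per_class_accuracy(confusion):
    """Fraction of each true class (row) that was predicted correctly."""
    accuracies = []
    for i, row in enumerate(confusion):
        total = sum(row)
        accuracies.append(row[i] / total if total else 0.0)
    return accuracies


def overall_accuracy(confusion):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Illustrative counts only (rows: true class, columns: predicted class).
TOY_CONFUSION = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 0, 50],
]

if __name__ == "__main__":
    for label, acc in enumerate(per_class_accuracy(TOY_CONFUSION)):
        print(f"class {label}: {acc:.2%}")
    print(f"overall: {overall_accuracy(TOY_CONFUSION):.2%}")
```

With unbalanced classes, such as a large Negatives class, the overall accuracy and the mean of the per-class accuracies can differ, so it helps to state which of the two an "average classification accuracy" refers to.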

Performance Comparison
To justify the proposed method, its results are compared with those of recently published methods on the same data set. The comparative results are given in Table 3. The Inception-v3 model was used to classify the nine weed plants, with an average classification accuracy of 95.1%. Similarly, the ResNet-50 deep learning model was evaluated on the nine weed plants, with an average accuracy of about 95.7% [50]. As Table 3 shows, the results of the proposed method are higher than those of Inception-v3 and ResNet-50 in both test settings, without and with data augmentation. Augmentation is an effective way to enlarge the data set and improve the performance of networks in weed data classification. Therefore, we performed experiments both with and without data augmentation. The accuracy achieved is 99.36% with data augmentation and 96.06% without it, as highlighted in Table 3 and Fig. 8. This clearly shows that the proposed method performs better with data augmentation and that the classification accuracy is somewhat lower without it. The results therefore suggest that classification without augmentation needs further improvement.

Figure 1. Deep learning architecture for detection of weeds.

Figure 2. The proposed classifier based on CNN and LSTM.

Figure 5. The original training image (a) and the images generated using the augmentation scheme (b).

Figure 6. Classification error of the proposed CNN-LSTM method.

Figure 7. Validation and training accuracy of the proposed CNN-LSTM method.

Figure 8. Comparative results with and without augmentation.

Figure 3 Distribution

Table 1. Names of the weed plants and their respective representations.

Table 2. Confusion matrix for the classification accuracy of the nine weed plants.

Table 3. Comparison of classification accuracy (%) for the weed plants with current approaches.