Image acquisition
Images of the peanut plant (Arachis hypogaea) and of three broadleaf weeds common in peanut fields, namely velvetleaf (Abutilon theophrasti), false daisy (Eclipta alba), and Nicandra (Nicandra physalodes), were acquired using an affordable cell-phone camera with a resolution of 2240×1344 pixels. The images were captured pointing vertically downwards from a distance of 40 cm above the crop rows. A white cotton sunshade was used so that the scene was illuminated by diffused sunlight and the leaves did not cast shadows on one another. A total of 100 images was prepared for this study. Fig. 1 shows sample images of the studied plants.
Feature extraction
In order to extract image features of the four mentioned plant types, fifty blocks of 100×100 pixels were manually cropped from the images of each plant type (50×4 = 200 image blocks in total) using Photoshop CS6 (Adobe Systems, USA). The original field-captured images were loaded into Photoshop, and the blocks were carefully extracted from different regions of the target plants using the Marquee tool, without any change to the resolution or colour values of the images. Region-of-interest selection using the selection tools of Photoshop has been reported in several image-processing studies [44, 47, 48]. The cropped blocks were then imported into the Image Processing Toolbox of MATLAB R2018a (The MathWorks, USA).
The prepared image blocks were converted from the RGB colour space into the HSV and L*a*b* colour spaces using the MATLAB conversion functions rgb2hsv and rgb2lab [27, 49, 50], and the mean and standard deviation of the Red, Green, Blue, Hue, Saturation, Value, L* (lightness), a*, and b* colour components were extracted.
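The colour-feature step can be sketched as follows. The study used MATLAB's rgb2hsv and rgb2lab; below is an illustrative NumPy equivalent with a hand-rolled RGB-to-HSV conversion (the six L*a*b*-based features, which bring the total to 18, are omitted for brevity):

```python
import numpy as np

def rgb_to_hsv(rgb):
    """Vectorised RGB -> HSV for an array of shape (..., 3) with values in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    maxc = rgb.max(axis=-1)
    minc = rgb.min(axis=-1)
    delta = maxc - minc
    v = maxc
    s = np.where(maxc > 0, delta / np.where(maxc > 0, maxc, 1.0), 0.0)
    h = np.zeros_like(maxc)
    mask = delta > 0
    rm = mask & (maxc == r)        # red is the max channel
    gm = mask & (maxc == g) & ~rm  # green is the max channel
    bm = mask & ~rm & ~gm          # blue is the max channel
    h[rm] = ((g - b)[rm] / delta[rm]) % 6.0
    h[gm] = (b - r)[gm] / delta[gm] + 2.0
    h[bm] = (r - g)[bm] / delta[bm] + 4.0
    return np.stack([h / 6.0, s, v], axis=-1)

def colour_features(block):
    """Mean and standard deviation of each R, G, B, H, S, V channel
    (12 of the study's 18 colour features)."""
    hsv = rgb_to_hsv(block)
    feats = []
    for img in (block, hsv):
        for c in range(3):
            feats.append(img[..., c].mean())
            feats.append(img[..., c].std())
    return np.array(feats)
```

For example, a 100×100×3 block yields one 12-element feature vector per crop.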
In this study, the Gray-Level Co-occurrence Matrix (GLCM) algorithm, a statistical texture analysis approach, was employed to extract texture features from the images. To extract such features, the colour blocks were converted to grayscale images using the rgb2gray function, and the GLCMs were constructed from the gray blocks. A GLCM represents the distribution of co-occurring gray-level values at a given offset over an image. In this study, GLCMs were calculated for a spatial distance of one pixel in four different directions (0, 45, 90, and 135 degrees). The average of the four resulting GLCMs was calculated and used for feature extraction from the corresponding grayscale image. Sixteen GLCM-based texture features, namely entropy, energy, inertia, correlation, homogeneity, dissimilarity, sum of squares, sum of averages, sum of variances, sum of entropies, difference variance, difference entropy, cluster shade, cluster prominence, inverse difference moment, and maximum probability, were calculated from the GLCMs and used for plant classification. These features have been previously described in detail and used by several researchers [44, 51-55].
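The GLCM averaging over the four directions can be illustrated with a minimal NumPy sketch; the study used MATLAB, and only four of the sixteen listed features are reproduced here for brevity:

```python
import numpy as np

# Offsets (dx, dy) for 0, 45, 90, and 135 degrees at a distance of one pixel.
OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, -1)]

def glcm(gray, dx, dy, levels=8):
    """Co-occurrence counts of gray levels at offset (dx, dy)."""
    m = np.zeros((levels, levels))
    h, w = gray.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[gray[y, x], gray[y2, x2]] += 1
    return m

def mean_glcm(gray, levels=8):
    """Average the four directional GLCMs and normalise to probabilities."""
    m = sum(glcm(gray, dx, dy, levels) for dx, dy in OFFSETS)
    return m / m.sum()

def glcm_features(p):
    """Four of the sixteen texture features used in the study."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]
    return {
        "energy": (p ** 2).sum(),
        "entropy": -(nz * np.log2(nz)).sum(),
        "inertia": ((i - j) ** 2 * p).sum(),          # also called contrast
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),
    }
```

The homogeneity weighting 1/(1+|i−j|) follows the convention of MATLAB's graycoprops; other references use 1/(1+(i−j)²).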
In the other part of the study, a one-level, two-directional Haar wavelet transform was applied to the grayscale block images, and four sub-images were derived. The resulting sub-images were the approximation, horizontal-detail, vertical-detail, and diagonal-detail images, also called the LL (Low-Low), LH (Low-High), HL (High-Low), and HH (High-High) sub-bands, respectively. The Haar wavelet, which is identical to the Daubechies db1 wavelet, is known as the first, simplest, and fastest wavelet type [56].
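A one-level 2-D Haar decomposition can be sketched as below (the averaging/differencing form; the study used MATLAB's wavelet functions, and conventions for which detail sub-band is labelled "horizontal" vs. "vertical" differ between implementations):

```python
import numpy as np

def haar2d(x):
    """One-level 2-D Haar transform. Returns the LL, LH, HL, HH
    sub-images, each half the size of x in both dimensions."""
    x = np.asarray(x, dtype=float)
    # Pass along one axis: low-pass (pairwise averages), high-pass (differences).
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Pass along the other axis gives the four sub-bands.
    ll = (lo[0::2, :] + lo[1::2, :]) / 2.0  # approximation (LL)
    lh = (lo[0::2, :] - lo[1::2, :]) / 2.0  # detail sub-band (LH)
    hl = (hi[0::2, :] + hi[1::2, :]) / 2.0  # detail sub-band (HL)
    hh = (hi[0::2, :] - hi[1::2, :]) / 2.0  # diagonal details (HH)
    return ll, lh, hl, hh
```

On a 100×100 block this yields four 50×50 sub-images; a constant region produces zero detail coefficients, so only textured regions contribute to the detail-band statistics.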
After obtaining the wavelet sub-bands, the above-mentioned texture features were also extracted from each sub-band image. Therefore, 64 wavelet-texture features (16 texture features × 4 wavelet sub-images) were extracted from each block and used for the classification of peanut and weeds.
Feature selection
There was a very large number of input features to be fed into the classifiers (18 colour features + 16 texture features + 64 wavelet-based texture features = 98 input features). A large input dimension can degrade classifier performance, as most pattern-recognition techniques are not designed to cope with large numbers of irrelevant features [57]. Therefore, two feature selection methods were applied to the input feature dataset in order to identify the most relevant features (or feature vectors) and to eliminate the redundant ones, thereby obtaining higher classification performance: Principal Component Analysis (PCA), an unsupervised feature selection method, and Correlation-based Feature Selection (CFS), a supervised feature selection method [58, 59].
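The PCA step can be sketched in a few lines of NumPy (an illustrative sketch only; the study's retained-component count and scaling choices are not restated here, and CFS is omitted because it requires the class labels and a search over feature subsets):

```python
import numpy as np

def pca_reduce(X, k):
    """Project an (n_samples, n_features) matrix onto its top-k
    principal components (directions of largest variance)."""
    Xc = X - X.mean(axis=0)                 # centre each feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]   # largest-variance directions first
    return Xc @ eigvecs[:, order]
```

For the 98-feature dataset described above, X would have shape (200, 98), one row per image block.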
Decision tree (DT)
Three types of DTs, including J48, Reduced Error Pruning (REP), and Random Tree (RT), were applied to distinguish the different plants. The J48 DT-inducing algorithm is an implementation of the well-known C4.5 DT in the WEKA software [60]. The C4.5 tree, known as one of the top 10 data-mining algorithms, chooses appropriate features and nodes based on information gain ratios [61, 62]. The REP tree classifier is a fast DT learner that builds a tree based on information gain with entropy and prunes it using reduced-error pruning [63]. RT is a DT whose nodes are constructed using randomly chosen attributes and whose class probabilities at each node are based on backfitting, with no pruning [64].
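The gain-ratio criterion that C4.5/J48 uses for node selection can be illustrated with a short sketch (the labels below are hypothetical, not the study's data):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(labels, groups):
    """Information gain of a candidate split, normalised by the split's own
    entropy (split information), as C4.5 does to avoid favouring
    many-valued attributes."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    gain = entropy(labels) - remainder
    # Split information: entropy of the group sizes themselves.
    split_info = entropy([i for i, g in enumerate(groups) for _ in g])
    return gain / split_info if split_info > 0 else 0.0
```

At each node, the attribute (and threshold) with the highest gain ratio is chosen as the split.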
DT Performance evaluation
The original datasets were split randomly into three subsets: training (60%), cross-validation (20%), and test (20%) data. The most successful DT models were preliminarily selected based on having the highest classification accuracy on the training dataset. Classification accuracy shows how closely the predictions given by a classifier match the observed data. This criterion was calculated using equation 21 [65]:

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (21)
Where TP, TN, FP, and FN denote, respectively, the true positive, true negative, false positive, and false negative counts in the confusion matrix of the classifier. The true positives were the positive samples (e.g. peanut plants) that were classified as positive; the true negatives were the negative samples (e.g. weeds) that were classified as negative; the false positives were the negative samples classified as positive; and the false negatives were the positive samples classified as negative. The accuracy metric indicates the overall effectiveness of the system, considering both positive and negative samples [66].
In addition to classification accuracy, two other statistics were calculated for the DT models, to be used in cases where DT models had similar classification accuracies. These two statistical indicators were Cohen's kappa coefficient and the Root Mean Squared Error (RMSE), which were measured on the training dataset and used to evaluate the performance of the developed classifiers. Kappa is a statistic that measures inter-rater agreement for categorical items. It is generally considered a more robust measure than a simple percent-agreement calculation, as kappa takes into account the possibility of agreement occurring by chance [67]. This criterion was calculated using equation 22:

kappa = (Po - Pe) / (1 - Pe)    (22)
Where Po is the relative observed agreement among raters and Pe is the hypothetical probability of chance agreement. More detailed information about this equation is given by Bonhomme et al. [68]. RMSE is a measure of the differences between the actually observed data and those predicted by the model [65, 69]:

RMSE = sqrt( (1/N) * sum over i of (Yobs,i - Ypred,i)^2 )
Where Yobs,i and Ypred,i are, respectively, the ith observed and predicted values out of N total data points. In cases of equal classification accuracies, the models with the higher kappa and lower RMSE values were selected as the superior models.
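The three evaluation statistics above (accuracy, Cohen's kappa, RMSE) can be computed from a confusion matrix and prediction vectors as in the following sketch:

```python
import numpy as np

def accuracy(tp, tn, fp, fn):
    """Equation 21: overall fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def cohen_kappa(confusion):
    """Equation 22: (Po - Pe) / (1 - Pe) from a square confusion matrix."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    po = np.trace(confusion) / n                                # observed agreement
    pe = (confusion.sum(0) * confusion.sum(1)).sum() / n ** 2   # chance agreement
    return (po - pe) / (1.0 - pe)

def rmse(y_obs, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))
```

For example, a binary confusion matrix [[40, 10], [5, 45]] gives Po = 0.85, Pe = 0.5, and hence kappa = 0.7.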
DT-based fuzzy system
In order to define the fuzzy membership functions and implement the fuzzy rules, it was necessary to transfer the obtained DT structures into the fuzzy model. The branching of the DTs was used to design the antecedent and consequent parts of the fuzzy rules. The most suitable trees were selected based on two factors: the highest accuracy and the simplest structure. A simple DT structure is very important for defining the rules and membership functions of a DT-based fuzzy system. The input variables and their membership functions were defined according to the nodes and their related threshold values in the DT structures.
A Mamdani fuzzy model was designed and implemented in the fuzzy toolbox of MATLAB based on the structure of the selected DT. The antecedent and consequent parts of the fuzzy rules were developed and adjusted according to the branches and leaves of the DT, respectively. Also, the nodes and threshold values of the DT were used to define the input variables and their related membership functions in the fuzzy system.
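The mapping from DT thresholds to fuzzy rules can be sketched as below. This is a heavily simplified illustration, not the study's model: the feature names, thresholds, and sigmoidal membership shapes are hypothetical, and a full Mamdani system would also aggregate and defuzzify output fuzzy sets rather than simply taking the strongest-firing rule.

```python
import math

def low_mf(x, threshold, width=1.0):
    """Sigmoidal 'low' membership centred on a DT node's threshold value."""
    return 1.0 / (1.0 + math.exp((x - threshold) / width))

def high_mf(x, threshold, width=1.0):
    """Complementary 'high' membership for the same node."""
    return 1.0 - low_mf(x, threshold, width)

def classify(hue_mean, glcm_energy):
    """Two hypothetical rules derived from a two-node DT:
    Rule 1: IF hue_mean is low  AND glcm_energy is high THEN 'peanut'
    Rule 2: IF hue_mean is high OR  glcm_energy is low  THEN 'weed'
    (min for AND, max for OR; the class with the stronger firing wins)."""
    r1 = min(low_mf(hue_mean, 0.3, 0.05), high_mf(glcm_energy, 0.5, 0.05))
    r2 = max(high_mf(hue_mean, 0.3, 0.05), low_mf(glcm_energy, 0.5, 0.05))
    return "peanut" if r1 > r2 else "weed"
```

Each DT node threshold becomes the crossover point of a low/high membership pair, and each root-to-leaf path becomes one rule whose consequent is the leaf's class.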