General Architecture Overview
For the classification of crops from the manually collected data, a two-level architecture is proposed. The two levels are supervised crop classification and crop-type classification, and for each level the best algorithm, optimiser and activation function are identified. Pre-processing is another critical phase in the research: because the data were obtained from multiple sources, issues such as inconsistent measurement units and missing values can emerge. The next step, district prediction, is the central part of the work and builds on the improved neural-network classification model. Data augmentation is used to enlarge the dataset by adding slightly modified copies of existing records or newly generated samples, which helps reduce overfitting during training. The overall architecture is shown in Figure 1, and the two levels are explained separately in Figure 2, which recommends to farmers the crop, crop type and suitable districts.
Study area and materials
The state of Tamil Nadu in India was selected as the study area. Gathering agricultural data is the most demanding and difficult task; data were collected from comprehensive analysis reports, case studies, the state's official agriculture website and other websites. The study covered 106 crop samples, comprising vegetable crops and 61 other crops including cereals, millets, pulses, oilseeds, fibre crops, sugar crops and forage crops; seasonal crops were not included in the current work. The attributes selected for the classification model are given in Table 1; they are agro-climatic parameters that strongly influence crop prediction. The crop model is built and evaluated on crop sowing data for 33 districts of Tamil Nadu. The collected data are published in Mendeley Data (http://dx.doi.org/10.17632/zyyb98msjc.1).
Table 1
Attributes of crop
Attribute of crop | Features
Type of Crop | Cereals, Millets, Pulses, Oil seeds, Fibre crops, Sugar crops, Forage crops, Cole crops, Vegetables, Root & Tuber, Green & Leafy, Bulb vegetables, Minor vegetables and Other crops
Type of Soil | Alluvial, Loamy, Sandy, Clayey, Black, Red, Sandy loamy, Black cotton soil, Clay loamy, Well-drained loamy, Heavy cotton, Silt loams, Well-drained sandy, Lateritic, Friable, Well-drained
Soil pH | Low and High
Duration of crop | Maximum and Minimum
Temperature | Maximum and Minimum
Relative humidity | Low and High
Districts | 29 districts of Tamilnadu
Feature Selection
The goal of feature selection is to identify a compact set of attributes that yields classification results comparable to using all attributes. Moreover, a smaller set of attributes produces less complex patterns that are easier for humans to grasp and interpret. The random forest algorithm is used to select the features. Random forests are an ensemble of tree predictors, where each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error of the forest converges almost surely to a limit as the number of trees increases, and it depends on the strength of the individual trees and the correlation between them. Using a random selection of features to split each node yields error rates that are more robust to noise than AdaBoost. Here, attributes whose importance score falls below the threshold of 1.5 times the standard deviation of all scores would be discarded; however, all attributes were retained for classification, as shown in Figure 2.
Pseudocode
DecisionTreeObj = DecisionTree(MutatedDataset)
# discard any feature whose importance score falls below 1.5 times the
# standard deviation of all scores (no feature did, so all are kept)
threshold = np.std(DecisionTreeObj.feature_scores) * 1.5
for feature, feature_score in zip(DecisionTreeObj.features, DecisionTreeObj.feature_scores):
    if feature_score < threshold:
        MutatedDataset = MutatedDataset.drop(columns=[feature])
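For illustration, the same importance-based selection can be written with scikit-learn's RandomForestClassifier; the sketch below is a hedged example in which the DataFrame name, column names and the 1.5-standard-deviation threshold mirror the pseudocode above but are not the exact implementation used in the study.

# A minimal sketch, assuming a pandas DataFrame `crops` whose "Crop" column is the
# target and whose remaining columns are the Table 1 attributes.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def feature_scores(X, y):
    """Rank attributes by random-forest importance (higher means more useful)."""
    forest = RandomForestClassifier(n_estimators=200, random_state=0)
    forest.fit(X, y)
    return pd.Series(forest.feature_importances_, index=X.columns).sort_values(ascending=False)

# Usage (illustrative column names):
# X = pd.get_dummies(crops.drop(columns=["Crop"]))       # encode categorical attributes
# scores = feature_scores(X, crops["Crop"])
# selected = scores[scores >= 1.5 * scores.std()].index  # same threshold as the pseudocode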
Pre-processing and Data Augmentation
The quality of input data can be improved through data pre-processing, also known as data preparation, which in turn affects the performance and analytical efficiency of the results. In this step, the data were converted into a uniform format to enhance the features. The following processes were performed. Temperature values were recorded in Fahrenheit for some crops and in Celsius for others, so all values were converted to the same unit. For experimentation, data augmentation was carried out based on the minimum and maximum values: each crop data row was duplicated into fifteen rows, with its numeric values increased and decreased in steps of 0.1. Missing values were filled with the average of the available data.
Pseudocode
Input: 106 crops with their actual values
Output: all non-categorical parameters are shifted in steps of 0.1 within the minimum and maximum values, creating 15 samples for each crop
-------------------------------------------------------------------------
for row in range(len(dataset)):
    for delta in np.arange(0, 1.5, 0.1):          # fifteen upward offsets (range() cannot step by 0.1)
        MutatedDataset = MutatedDataset.append(dataset.iloc[row] + delta)
    for delta in np.arange(0, -1.5, -0.1):        # fifteen downward offsets
        MutatedDataset = MutatedDataset.append(dataset.iloc[row] + delta)
return (x_train, x_test, y_train, y_test)          # train/test split of MutatedDataset
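A runnable version of this augmentation step is sketched below with pandas and scikit-learn; the file name crops.csv, the restriction of the 0.1 shifts to numeric columns and the "Crop" label column are assumptions made for illustration.

# A minimal sketch, assuming `crops.csv` holds the 106 records and that only
# numeric columns are shifted; the column names are illustrative.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

crops = pd.read_csv("crops.csv")                      # hypothetical file name
numeric = crops.select_dtypes(include="number").columns

augmented = []
for _, row in crops.iterrows():
    for delta in np.concatenate([np.arange(0, 1.5, 0.1), np.arange(-0.1, -1.5, -0.1)]):
        shifted = row.copy()
        shifted[numeric] = shifted[numeric] + delta   # shift only numeric attributes
        augmented.append(shifted)
mutated = pd.DataFrame(augmented)

x_train, x_test, y_train, y_test = train_test_split(
    mutated.drop(columns=["Crop"]), mutated["Crop"], test_size=0.2, random_state=0)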
Supervised Classification
The predictive properties of the models were analysed by classifying the crops based on soil and agrometeorological conditions and on the type of crop. Agriculture depends on climate and soil conditions, and crops usually fall under general categories such as cereals, pulses, millets, etc. Therefore, classification based on agrometeorological conditions and type of crop helps in recommending new and hybrid varieties of crops to farmers. Various machine learning algorithms were used to classify the crops and determine the best model. From the literature survey, six different algorithms were considered for experimentation: a baseline model, decision tree, linear model, random forest, XGBoost and a neural network model. The baseline model acts as a reference point for comparison with the other models and helps in the exact assessment of their properties; it can be framed using regression error curves, the mean or median of the data, etc. (Whigham et al., 2015). A decision tree is a useful data mining and machine learning method for complicated data, which can be alphabetical, numerical or nominal; a node is created based on the "information gain" determined by the attributes (Somvanhi et al., 2016). A linear model employs regressors built from new and existing independent variables; however, it requires balancing bias and variance and avoiding overfitting to obtain optimal results (Wilson & Sahinidis, 2017). XGBoost is a prominent data mining tool that incorporates features of many related techniques, such as CPU multithreading for parallelism, handling of sparse data as in decision trees, and processing of large volumes of data at high speed through block technology (Lu & Ma, 2020). Random forest is an "ensemble learning" method based on the decision tree model; a random sample of predictors is used prior to splitting each node, thereby decreasing the bias. Its advantages are the introduction of two random elements, the ability to analyse high-dimensional data and shorter training times (Lu & Ma, 2020). An Artificial Neural Network (ANN), inspired by the brain's biological neural processes, learns to recognize patterns or relationships in the data by observing a large number of input-output examples during training.
The following tables compare the classification models based on agrometeorological conditions and on the type of crop. Choosing the best algorithm is essential for building the model; here, the candidate classifiers were built and the best algorithm was chosen based on the minimal error value. Classification models were built both for the individual crop and for the type of crop, and the results are depicted in Figures 3 and 4. The autoML function in Python was used for the analysis, in which the ensemble model selects the better algorithms and builds a new ensemble; in this case the ensemble model chose the artificial neural network alone, which shows that the ANN works better for both the crop and the crop type. The results show that the neural network model has a training time of 523.03 seconds with a logloss value of 0.00034 for agrometeorological conditions, and 648.54 seconds with a logloss value of 0.00018 for type of crop, giving the minimal logloss among the algorithms. Therefore, the artificial neural network provides the best predictive model for crop classification. Optimizers vary the attributes of the neural network to decrease the losses due to errors; the optimizers Adam, Nadam, SGD, Adagrad, Adadelta, Adamax and RMSprop were used. In addition, activation functions such as ReLU, Leaky ReLU, PReLU and ELU were used to shorten the training of the network. For both classification processes, the logloss metric was used to validate the model. Among the models, ReLU and SGD provide the better results.
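Such a comparison can be reproduced with standard libraries; the sketch below trains the candidate classifiers on the augmented data and ranks them by logloss. The hyperparameters are library defaults and the variables come from the earlier augmentation sketch, so this is an illustrative reconstruction rather than the autoML configuration used in the study.

# A minimal sketch, assuming x_train/x_test/y_train/y_test from the augmentation step
# (a stratified split so every crop class appears in both sets) and default hyperparameters.
from sklearn.dummy import DummyClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import log_loss
from xgboost import XGBClassifier

le = LabelEncoder().fit(y_train)                       # encode crop names as integers
y_tr, y_te = le.transform(y_train), le.transform(y_test)

models = {
    "baseline": DummyClassifier(strategy="most_frequent"),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "linear model": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
    "neural network": MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
}

for name, model in models.items():
    model.fit(x_train, y_tr)
    print(name, log_loss(y_te, model.predict_proba(x_test)))   # lower logloss is better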
A learning curve displays an estimator's training and validation scores for varying numbers of training samples. It is a tool to find out how much the model benefits from additional training data and whether the estimator suffers more from a variance error or a bias error. The learning curves for both the individual crop and the crop type are illustrated in Figures 5 and 6, which show that a good model has been built.
More generally, a learning curve relates the performance of a learner on a task to the number of attempts made or the time taken to carry out the task.
Y = aX^b (1)
Where:
Y is the average time over the measured duration
a represents the time to complete the task the first time
X represents the total number of attempts completed
b represents the slope of the function
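scikit-learn's learning_curve helper produces exactly this kind of diagnostic plot; the sketch below is a hedged example using the illustrative variables and the MLP stand-in from the earlier sketches, not the code that generated the actual figures.

# A minimal sketch, assuming x_train/y_train from the augmentation step and an
# MLPClassifier as a stand-in for the final ANN.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import learning_curve

sizes, train_scores, val_scores = learning_curve(
    MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
    x_train, y_train, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

plt.plot(sizes, train_scores.mean(axis=1), label="training score")
plt.plot(sizes, val_scores.mean(axis=1), label="cross-validation score")
plt.xlabel("number of training samples"); plt.ylabel("accuracy"); plt.legend()
plt.show()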
District Prediction
District prediction is the novel part of the approach, in which all the parameters of a crop and the district-wise pattern of crop cultivation are considered. The presence of a crop in a district is marked as 1 and its absence as 0. The district labels are converted into feature vectors using a one-hot encoder in Python. An artificial neural network model is then created to classify and predict the districts; the grid search method is used to tune the hyperparameters and build a strong classification model, and the districts are represented in categorical format so that outputs are produced for all 33 districts. The pseudocode for building the neural network model is as follows.
Pseudocode
# Create an ANN for district recommendation:
#   input layer   - crop, crop type and the features retained by the decision tree
#   hidden layers - 3, to optimize the weights
#   output layer  - number of districts, as categorical output
X = [Crop, Crop_Type, Selected_Features]
Y = [District_1..District_33]
nn = Sequential()
nn.add(input_layer)
nn.add(layer_1)
nn.add(layer_2)
nn.add(layer_3)
nn.add(output_layer)
nn.compile()
nn.fit(X, Y)
# Use the grid search method to find the best hyperparameters for the network
GridSearchObj = GridSearch(nn)
suggested_parameters = GridSearchObj.best_performance
# Use the obtained parameters to build the final network and predict
nn.compile(suggested_parameters)
nn.fit(X, Y)
district_recommendation = nn.predict(test)
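A runnable sketch of this district-prediction network is given below using Keras; the layer widths, the small manual hyperparameter loop standing in for grid search, and the variable names (X for the encoded crop features, Y for the 33-column 0/1 district matrix, X_test for held-out crops) are assumptions for illustration rather than the authors' exact settings.

# A minimal sketch, assuming X holds the one-hot-encoded crop/crop-type/selected
# features and Y is an (n_samples, 33) 0/1 district matrix; sizes are illustrative.
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_features, n_districts=33, units=64, lr=1e-3):
    model = keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(units, activation="relu"),
        layers.Dense(units, activation="relu"),
        layers.Dense(units, activation="relu"),
        # sigmoid outputs: a crop may be suitable for several districts at once
        layers.Dense(n_districts, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# small manual search over a few hyperparameters (a stand-in for full grid search)
best = None
for units in (32, 64, 128):
    for lr in (1e-2, 1e-3):
        hist = build_model(X.shape[1], units=units, lr=lr).fit(
            X, Y, validation_split=0.2, epochs=50, verbose=0)
        score = min(hist.history["val_loss"])
        if best is None or score < best[0]:
            best = (score, units, lr)

# rebuild with the best settings and predict district suitability for new crops
final = build_model(X.shape[1], units=best[1], lr=best[2])
final.fit(X, Y, epochs=50, verbose=0)
district_probabilities = final.predict(X_test)   # X_test: held-out crop features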
Crop Recommendation System
The crop recommendation system, depicted in Figure 2, is an artificial neural network model that combines the individual crop, the crop type and the districts. The structure of a neural network is also known as its 'architecture' or 'topology'. It consists of the number of layers and of elementary units per layer, together with a scheme for adjusting the weights of the interconnections. The choice of structure influences the results and is the most important aspect of the neural network implementation. The predictive strength of the network increases when one or more hidden layers, and units within them, are added between the input and output layers; yet the number of hidden layers should be kept as small as possible, so that the network does not simply memorise the training data but generalizes from it and avoids overfitting. While building these classification models, the epoch loss consistently decreases and the epoch accuracy increases, as shown in Figures 7, 8 and 9.
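As a hedged illustration of how the decreasing epoch loss and increasing epoch accuracy can be tracked while keeping the network from overfitting, the sketch below trains the hypothetical Keras model from the previous listing with an early-stopping callback; the patience and epoch counts are illustrative, not the study's configuration.

# A minimal sketch, assuming the build_model, X and Y names from the previous listing.
from tensorflow.keras.callbacks import EarlyStopping

model = build_model(X.shape[1])
history = model.fit(
    X, Y, validation_split=0.2, epochs=200,
    callbacks=[EarlyStopping(monitor="val_loss", patience=10, restore_best_weights=True)],
    verbose=0)

# history.history holds the per-epoch loss/accuracy curves of the kind shown in Figures 7-9
print(history.history["loss"][-1], history.history["accuracy"][-1])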