4.1 Image background segmentation
Before feature extraction (FE) of the 200 images, it is necessary to segment the image backgrounds and retain only the main content, because the photographed images contain many background factors such as walls and lights. The background of each rose image is segmented using the method shown in Fig. 8.
The background of each image is segmented with the removebg algorithm and the GrabCut algorithm. A rose image after background segmentation is shown in Fig. 9.
The 3 plots in Fig. 10 are the pixel histograms of the original image, the background-removed image and the green-leaf-removed image, respectively. The pixel values of the original image are evenly distributed between 0 and 255, so during feature extraction the background pixel values cannot be distinguished from those of the main subject. In the pixel histograms of the rose image with the background and green leaves removed, most pixels are black and clearly separated from the remaining pixel values. Therefore, in FE, the image after background segmentation better expresses the main content of the image.
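The histogram comparison above can be reproduced with a few lines of numpy; the tiny 2x3 "image" below is an illustrative stand-in, not one of the paper's rose images:

```python
import numpy as np

def pixel_histogram(gray_image):
    """Count how many pixels take each value 0-255.

    gray_image: 2-D numpy array of uint8 pixel values.
    Returns a length-256 array of counts.
    """
    counts, _ = np.histogram(gray_image, bins=256, range=(0, 256))
    return counts

# Toy 2x3 "image": after background segmentation most pixels are black (0),
# which is why the black bin dominates the histograms in Fig. 10.
segmented = np.array([[0, 0, 200],
                      [0, 180, 0]], dtype=np.uint8)
hist = pixel_histogram(segmented)
print(hist[0])   # → 4, the number of black background pixels
```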
4.2 Image feature extraction
After background segmentation removes the irrelevant factors from the rose images, this paper extracts features from 3 aspects: color features, geometric features and texture features.
4.2.1 Color feature extraction
In image processing, the RGB model is the most common and important color space model. This paper extracts the color features of yellow roses based on the RGB model of the color image. A color image is composed of 3 color channels: red, green and blue; any color is formed by superimposing and mixing the 3 color components, and if the proportions of the 3 components differ, the resulting color differs. Each of the red, green and blue channels ranges from 0 to 255, so each channel has 256 gray levels, and the R, G and B components are highly correlated. The proportions of the 3 components cannot be judged directly by the human eye; for reference, (255, 0, 0) is red, (0, 255, 0) is green, (0, 0, 255) is blue, (255, 255, 255) is white, and (0, 0, 0) is black.
In the RGB color model, the red and green components account for a large proportion of yellow and carry most of its color information, so differences between yellow roses are most significant in these channels. Therefore, the mean and variance of the red channel and of the green channel of each rose image are selected as color features. To compute these statistics accurately, Python's PIL library is used to separate the red channel image and the green channel image from the color image; the pixel values are extracted into a list, the list is traversed, black background pixels (0, 0, 0) are assigned NaN, and finally the mean and variance statistics are calculated.
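A minimal sketch of this computation, using numpy's NaN-aware statistics in place of PIL's per-pixel lists (the toy 2x2 image is illustrative only, not the paper's data):

```python
import numpy as np

def channel_stats(rgb):
    """Mean and variance statistics of the red and green channels,
    ignoring black (0, 0, 0) background pixels, as in Section 4.2.1.

    rgb: H x W x 3 array of pixel values in [0, 255].
    Returns (red_mean, red_std, green_mean, green_std).
    """
    rgb = rgb.astype(float)
    background = np.all(rgb == 0, axis=-1)   # mask of black segmented pixels
    red, green = rgb[..., 0].copy(), rgb[..., 1].copy()
    red[background] = np.nan                 # background treated as NaN
    green[background] = np.nan
    return (np.nanmean(red), np.nanstd(red),
            np.nanmean(green), np.nanstd(green))

# Toy 2x2 image: two background pixels, two yellowish foreground pixels.
img = np.array([[[0, 0, 0], [200, 180, 20]],
                [[240, 220, 30], [0, 0, 0]]])
red_mean, red_std, green_mean, green_std = channel_stats(img)
print(red_mean)   # → 220.0, the mean of the two foreground red values
```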
The box plots in Fig. 15 show the basic information of each statistic for the 5 grades of yellow roses, with the red line marking the sample mean of each grade; red_mean and red_std are the mean and variance of the red channel, and green_mean and green_std are the mean and variance of the green channel. Clearly, when yellow roses are classified by color features, there is no significant difference in the means among the first 4 grades, so a neural network cannot accurately distinguish grades 1 to 4; however, there are obvious differences between the first 4 grades and the fifth grade, so the extracted color feature data still yield a high recognition rate when sent to the neural network.
4.2.2 Geometric feature extraction
Observing the flower and stem images of the yellow roses, the area of the main part of the image (flowers, stems and green leaves), the length of the stem and the detected edges directly reflect the geometric characteristics of the image. Therefore, this paper examines the geometric features from 3 aspects: area statistics, length statistics and edge-detection statistics.
In terms of area, for convenience of calculation all images are assumed to have unit area, and the proportion of the flower-and-leaf area and the proportion of the stem-and-leaf area are extracted from the background-segmented images that retain leaves. The specific algorithm traverses all images with the PIL library, stores the RGB pixel values of each image as a list and computes the list length; a for loop then visits channels R, G and B in turn, accumulates the number of elements whose 3 channel values are not all zero, and finally computes the proportion of this count in the list length, which gives the desired area statistic. In terms of length, the stem length was manually measured when the image data were collected and stored in Excel to generate the stem-length feature data. For edge detection, the PIL library is again used, applying the parameter ImageFilter.FIND_EDGES to each image.
The edge perimeter of each rose is then calculated by roughly the same procedure as the area statistic, after which all the geometric feature data are obtained.
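The area statistic described above can be sketched as a plain loop over the pixel list (the pixel tuples below are illustrative); applying the same count to the output of ImageFilter.FIND_EDGES yields the edge-perimeter statistic:

```python
def area_proportion(pixels):
    """Fraction of pixels belonging to the subject (not pure black).

    pixels: list of (R, G, B) tuples, as produced by PIL's
    Image.getdata() on a background-segmented image.
    """
    foreground = sum(1 for (r, g, b) in pixels if (r, g, b) != (0, 0, 0))
    return foreground / len(pixels)

# Toy 2x2 image flattened to a pixel list: 3 foreground, 1 background pixel.
pixels = [(200, 180, 20), (0, 0, 0), (240, 220, 30), (150, 160, 10)]
print(area_proportion(pixels))   # → 0.75
```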
Figure 17 shows the box plots of the geometric features of the yellow rose data, where rose_area is the proportion of the flower-and-leaf area, rose_pole_area is the proportion of the stem-and-leaf area, rose_pole_length is the stem length, and rose_edge is the edge-detection statistic of flowers and leaves. When yellow roses are classified by geometric features, the red lines for flower area, stem length and edge detection show a downward trend from the first to the fourth grade, indicating significant differences among the roses of these grades. Therefore, when the extracted geometric feature data are sent to the neural network, a high recognition rate is obtained and roses of different grades can be classified more accurately.
4.2.3 Texture feature extraction
In this paper, the texture features of yellow roses are extracted based on the gray-level co-occurrence matrix (GLCM). The GLCM measures the joint distribution of pixel values in a gray-scale image: if the pixel values change only within a small threshold, i.e. the variance of the gray values is small, the texture of the image changes slowly; otherwise, the texture changes greatly. In this paper, the GLCM is computed with the function greycomatrix() from the skimage.feature library in Python, and the statistics of energy, contrast, dissimilarity and homogeneity are computed with the function greycoprops() to extract the texture features of the rose flowers.
(1) \({\text{Energy}}=\sqrt{\sum\limits_{i}\sum\limits_{j} p(i,j)^{2}}\)

(2) \({\text{Contrast}}=\sum\limits_{i}\sum\limits_{j} (i-j)^{2}\,p(i,j)\)

(3) \({\text{Dissimilarity}}=\sum\limits_{i}\sum\limits_{j} |i-j|\,p(i,j)\)

(4) \({\text{Homogeneity}}=\sum\limits_{i}\sum\limits_{j} \frac{p(i,j)}{1+(i-j)^{2}}\)
Here p(i, j) denotes the element in row i and column j of the normalized GLCM. Energy measures the stability of the image texture. Contrast and dissimilarity both measure the variation of image brightness: the former weights the matrix elements quadratically in (i - j), while the latter weights them linearly. Homogeneity weights the matrix elements by the inverse of 1 + (i - j)^2, measuring whether the texture changes evenly. Following these definitions, this paper first uses the parameter cv2.IMREAD_GRAYSCALE in the OpenCV library to batch-convert the rose images to gray-scale images; then each gray image is passed to greycomatrix() with 256 gray levels, distance 1 and a horizontal (rightward) scan, and the symmetric GLCM is output; finally, the GLCM is passed to greycoprops(), and the 4 feature scalars energy, contrast, dissimilarity and homogeneity are output as the texture feature data.
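To make formulas (1)-(4) concrete, a hand-rolled miniature of the symmetric, distance-1, horizontal GLCM is sketched below; skimage's greycomatrix()/greycoprops() perform this computation at 256 gray levels, while the 4-level toy image here keeps the matrix small:

```python
import numpy as np

def glcm_features(gray, levels=4):
    """Symmetric GLCM (distance 1, horizontal direction) and the four
    texture statistics of Eqs. (1)-(4). `levels` is kept small for the
    toy example; the paper uses 256 gray levels.
    """
    glcm = np.zeros((levels, levels))
    for row in gray:
        for a, b in zip(row[:-1], row[1:]):   # horizontal neighbour pairs
            glcm[a, b] += 1
            glcm[b, a] += 1                   # make the matrix symmetric
    p = glcm / glcm.sum()                     # normalise to probabilities
    i, j = np.indices(p.shape)
    return {
        "energy": np.sqrt(np.sum(p ** 2)),                 # Eq. (1)
        "contrast": np.sum((i - j) ** 2 * p),              # Eq. (2)
        "dissimilarity": np.sum(np.abs(i - j) * p),        # Eq. (3)
        "homogeneity": np.sum(p / (1 + (i - j) ** 2)),     # Eq. (4)
    }

# Tiny 2x3 gray image with 4 gray levels.
gray = np.array([[0, 0, 1],
                 [2, 2, 3]])
feats = glcm_features(gray)
print(feats["contrast"])   # → 0.5
```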
Figure 18 shows the box plots of the texture features of the yellow roses. When yellow roses are classified by texture features, the red lines in the energy and homogeneity plots rise from the first to the fourth grade and fall from the fourth to the fifth grade; in the contrast plot, the red line shows a uniform downward trend overall; in the dissimilarity plot, the red line falls from the first to the fourth grade and rises from the fourth to the fifth grade. Clearly, there are significant differences among the yellow roses from the first to the fifth grade, so the extracted texture feature data yield a high recognition rate when sent to the neural network.
4.3 Evaluation of characteristic variables based on factor analysis
In Section 4.2, 12 image features are extracted, and each feature extracted is assumed to correspond to a variable. The specific variable names are shown in Table 4.1.
Table 4.1
Feature information extracted from yellow rose
| Symbol | Variable | Symbol | Variable |
| red_mean | Mean value of red channel | rose_pole_length | Stem length |
| red_std | Variance of red channel | rose_edge | Edge detection of flowers and leaves |
| green_mean | Mean value of green channel | energy | Energy |
| green_std | Variance of green channel | contrast | Contrast |
| rose_area | Proportion of flower and leaf area | dissimilarity | Dissimilarity |
| rose_pole_area | Proportion of flower stem and leaf area | homogeneity | Homogeneity |
After the feature extraction, feature data containing the 12 variables are obtained for each yellow rose. In this paper, the 12 variables of the 1600 samples in the training set are evaluated by factor analysis. Before the factor analysis, the KMO test and Bartlett's sphericity test are conducted on the 12 variables; the test results are shown in Table 4.2.
Table 4.2
KMO and Bartlett sphericity test
| KMO | | 0.923 |
| Bartlett sphericity test | \({\chi ^2}\) statistic | 5601.237 |
| | Significance | 0.000 |
It can be seen from Table 4.2 that the KMO value is 0.923 and the p value of Bartlett's sphericity test is 0.000, indicating a strong correlation among the 12 variables, which are therefore suitable for factor analysis.
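The paper runs these tests in SPSS; the standard formulas can be sketched in numpy as follows (the synthetic correlated data are illustrative only, so the resulting statistics will not match Table 4.2):

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's test of sphericity: chi-square statistic and degrees
    of freedom for the hypothesis that the correlation matrix is the
    identity (i.e. the variables are uncorrelated)."""
    n, p = data.shape
    R = np.corrcoef(data, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    dof = p * (p - 1) / 2
    return chi2, dof

def kmo(data):
    """Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(R)
    # Partial correlations obtained from the inverse correlation matrix.
    partial = -inv / np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    off = ~np.eye(R.shape[0], dtype=bool)   # off-diagonal mask
    r2 = np.sum(R[off] ** 2)
    return r2 / (r2 + np.sum(partial[off] ** 2))

# Illustrative data: 4 variables driven by one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(200, 1))
data = common + 0.3 * rng.normal(size=(200, 4))
chi2, dof = bartlett_sphericity(data)
kmo_value = kmo(data)
```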
Factor analysis of the 12 variables of the yellow rose data with SPSS yields the total variance explanation table shown in Table 4.3. Generally, when the cumulative variance contribution rate of the factor analysis reaches 75.00%, the corresponding factors are considered to explain the original information well. The cumulative variance contribution rate of the first 3 factors reaches 78.602%, i.e. the first 3 factors explain 78.602% of the original information, showing that they can represent the information of the 12 variables well.
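The eigenvalues and cumulative variance contribution rates of Table 4.3 come from the correlation matrix of the standardized data; a numpy sketch follows (with illustrative random data, so the numbers will not match the table):

```python
import numpy as np

def variance_explained(data):
    """Eigenvalues of the correlation matrix (initial eigenvalues of the
    factor analysis) and the cumulative variance contribution rate in %."""
    R = np.corrcoef(data, rowvar=False)
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # descending order
    contrib = 100 * eigvals / eigvals.sum()          # variance (%) per factor
    return eigvals, np.cumsum(contrib)

# Illustrative data set with 12 variables, standing in for the real features.
rng = np.random.default_rng(1)
data = rng.normal(size=(200, 12))
eigvals, cumulative = variance_explained(data)
# Retention rule used in the paper: keep factors until cumulative >= 75%.
n_factors = int(np.searchsorted(cumulative, 75.0) + 1)
```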
Table 4.3
Interpretation of total variance
| Factor | Initial eigenvalue | | | Extract the sum of squares of loads | | |
| | Total | Variance (%) | Cumulative (%) | Total | Variance (%) | Cumulative (%) |
| 1 | 10.588 | 43.014 | 43.014 | 10.588 | 43.014 | 43.014 |
| 2 | 5.674 | 23.051 | 66.065 | 5.674 | 23.051 | 66.065 |
| 3 | 3.086 | 12.537 | 78.602 | 3.086 | 12.537 | 78.602 |
| 4 | 1.063 | 4.319 | 82.921 | — | — | — |
| 5 | 0.937 | 3.807 | 86.728 | — | — | — |
| 6 | 0.728 | 2.958 | 89.685 | — | — | — |
| 7 | 0.614 | 2.494 | 92.180 | — | — | — |
| 8 | 0.560 | 2.275 | 94.455 | — | — | — |
| 9 | 0.363 | 1.475 | 95.929 | — | — | — |
| 10 | 0.355 | 1.442 | 97.372 | — | — | — |
| 11 | 0.335 | 1.361 | 98.732 | — | — | — |
| 12 | 0.312 | 1.268 | 100.000 | — | — | — |

Note: — means no data.
The coefficient matrix and score coefficient matrix of the first 3 factors and 12 variables are shown in Table 4.4.
Table 4.4
Coefficient matrix and score coefficient matrix of the first 3 factors
| Variable | Factor1 | | Factor2 | | Factor3 | |
| | Coefficient | Score coefficient | Coefficient | Score coefficient | Coefficient | Score coefficient |
| red_mean | 0.536 | 0.078 | 0.317 | 0.034 | 0.593 | 0.079 |
| red_std | 0.307 | 0.045 | 0.227 | 0.026 | 0.675 | 0.086 |
| green_mean | 0.580 | 0.085 | 0.402 | 0.039 | 0.710 | 0.092 |
| green_std | 0.428 | 0.063 | 0.201 | 0.024 | -0.607 | -0.078 |
| rose_area | 0.920 | 0.135 | 0.491 | 0.069 | 0.117 | 0.020 |
| rose_pole_area | 0.813 | 0.119 | 0.249 | 0.030 | 0.104 | 0.018 |
| rose_pole_length | 0.761 | 0.111 | -0.227 | -0.027 | 0.215 | 0.025 |
| rose_edge | 0.920 | 0.135 | 0.394 | 0.041 | 0.100 | 0.015 |
| energy | -0.919 | -0.134 | -0.095 | -0.010 | 0.294 | 0.035 |
| contrast | 0.746 | 0.109 | 0.139 | 0.022 | 0.381 | 0.037 |
| dissimilarity | 0.875 | 0.128 | 0.128 | 0.021 | -0.229 | -0.028 |
| homogeneity | -0.913 | -0.134 | -0.073 | -0.008 | 0.383 | 0.039 |
If the 3 selected factors are denoted F1, F2 and F3, each factor is a linear combination of the 12 standardized variables, with weights given by the score coefficients in Table 4.4.
It can be seen from Table 4.4 that the information of factor 1 mainly comes from the variables energy, contrast, dissimilarity and homogeneity, so it is called the image texture feature factor; the information of factor 2 mainly comes from rose_area, rose_pole_area, rose_pole_length and rose_edge, so it is called the image geometric feature factor; the information of factor 3 mainly comes from red_mean, red_std, green_mean and green_std, so it is called the image color feature factor. Clearly, after evaluating the 12 variables by factor analysis, the absolute values of the component coefficients and score coefficients of the texture and geometric features are greater than those of the color features, i.e. the contribution rates of the texture and geometric features are greater than that of the color features. This result is basically consistent with the analysis in Section 4.2.
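Since each factor score is a linear combination of the 12 standardized variables, the mapping from feature data to the 3 factors can be sketched with the score coefficients of Table 4.4 (the random input below is illustrative, not the paper's feature data):

```python
import numpy as np

# Score coefficients of the 3 factors (rows: the 12 variables in the
# order of Table 4.4; columns: factor1, factor2, factor3).
W = np.array([
    [ 0.078,  0.034,  0.079],   # red_mean
    [ 0.045,  0.026,  0.086],   # red_std
    [ 0.085,  0.039,  0.092],   # green_mean
    [ 0.063,  0.024, -0.078],   # green_std
    [ 0.135,  0.069,  0.020],   # rose_area
    [ 0.119,  0.030,  0.018],   # rose_pole_area
    [ 0.111, -0.027,  0.025],   # rose_pole_length
    [ 0.135,  0.041,  0.015],   # rose_edge
    [-0.134, -0.010,  0.035],   # energy
    [ 0.109,  0.022,  0.037],   # contrast
    [ 0.128,  0.021, -0.028],   # dissimilarity
    [-0.134, -0.008,  0.039],   # homogeneity
])

def factor_scores(X):
    """Map feature data (n_samples x 12) to the 3 factor scores F = Z W,
    where Z is the column-standardized feature matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    return Z @ W

# Illustrative input standing in for the extracted feature data.
rng = np.random.default_rng(2)
F = factor_scores(rng.normal(size=(50, 12)))
```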
4.4 Rose classification algorithm based on FE and ANN
In order to solve the over-fitting problem in Chap. 3 and improve the recognition rate, this paper sends the 3-factor data of the 1200 pictures in the training set, obtained in Section 4.3, into an ANN for recognition and classification. The flow chart of the rose classification algorithm based on FE and ANN, designed with TensorFlow 2.0, is shown in Fig. 19.
The ANN based on TensorFlow 2.0 is a 3-layer network: the first layer has 3 neurons, the second layer 10 and the third layer 5. The relu activation function is used, and the softmax classifier produces the output. In the compile() configuration of the training method, the adam optimizer, the sparse_categorical_crossentropy loss function and the sparse_categorical_accuracy evaluation metric are selected. In the built ANN model, 20 feature samples are fed in per batch, and the model is trained for 20, 40, 60, 80, 100 and 120 iterations respectively. The parameter statistics are shown in Table 4.5.
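The paper builds this network with tf.keras; the forward pass of the same 3-10-5 architecture can be sketched in plain numpy (reading the 3-neuron first layer as the input of the 3 factor scores; the weights below are random placeholders, not the trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights of a 3 -> 10 -> 5 fully connected network. In the paper these
# are learned with the adam optimizer in TensorFlow 2.0; here they are
# random, purely to illustrate the data flow.
W1, b1 = rng.normal(size=(3, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 5)), np.zeros(5)

def relu(x):
    return np.maximum(x, 0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))   # numerically stable
    return e / e.sum(axis=-1, keepdims=True)

def forward(factors):
    """factors: batch of 3 factor scores -> probabilities over 5 grades."""
    h = relu(factors @ W1 + b1)          # hidden layer, relu activation
    return softmax(h @ W2 + b2)          # softmax classifier output

# One batch of 20 samples, matching the paper's batch size.
probs = forward(rng.normal(size=(20, 3)))
```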
Table 4.5
Parameter statistics of rose classification algorithm based on FE and ANN
| Layer | Output shape | Params |
| The first layer | multiple | 520 |
| The second layer | multiple | 820 |
| The third layer | multiple | 105 |
| Total params | | 1445 |
The recognition rates on the training and test sets are shown in Fig. 20. It can be seen that for 20, 40, 60, 80, 100 and 120 iterations, the recognition rate of the rose classification algorithm based on FE and ANN increases with the number of iterations, and the recognition rate on both the training and test sets exceeds 90%; the test-set recognition rate is very close to that of the training set.
To further improve the classification recognition rate, experiments show that choosing an appropriate number of iterations is particularly important for the rose classification algorithm based on FE and ANN. With 160 iterations, the highest classification recognition rates on the training and test sets are obtained (as shown in Fig. 21). The first plot in Fig. 21 shows the loss on the training and test sets under the algorithm, and the second shows the recognition rates. For both the loss and the recognition rate, the results of the training and test sets are very close. The recognition rate on the training set is mostly above 95%, with a maximum of 97.81%, and on the test set mostly above 94%, with a maximum of 96.61%, a significant improvement over the classification algorithms based on ANN and CNN.
4.5 Comparison of 3 algorithms
In this paper, the rose classification algorithm based on ANN is named algorithm 1, the one based on CNN is named algorithm 2, and the one based on FE and ANN is named algorithm 3. Training and recognition are carried out 100 times under each algorithm; runs in which the recognition rate has stabilized are selected, and the average recognition rate of each training run and the final average recognition rate over the 100 runs are obtained by taking means.
Under algorithms 1, 2 and 3, the rose images are trained and recognized 100 times each, and the average recognition rates of the 3 algorithms for each run are obtained, as shown in Table 4.6 (only the recognition rates of the first 15 training runs are shown).
Table 4.6
Recognition rate of 3 algorithms
| Training times | Average recognition rate of training set | | | Average recognition rate of test set | | |
| | Algorithm 1 | Algorithm 2 | Algorithm 3 | Algorithm 1 | Algorithm 2 | Algorithm 3 |
| 1 | 0.8553 | 0.9471 | 0.9883 | 0.6556 | 0.8594 | 0.9479 |
| 2 | 0.8616 | 0.9402 | 0.9719 | 0.6973 | 0.8473 | 0.9470 |
| 3 | 0.8607 | 0.9158 | 0.9730 | 0.7068 | 0.8183 | 0.9487 |
| 4 | 0.8543 | 0.9473 | 0.9673 | 0.6678 | 0.8592 | 0.9475 |
| 5 | 0.8636 | 0.9198 | 0.9792 | 0.7057 | 0.8092 | 0.9614 |
| 6 | 0.8580 | 0.9507 | 0.9686 | 0.6490 | 0.8815 | 0.9492 |
| 7 | 0.8585 | 0.9478 | 0.9843 | 0.6698 | 0.8379 | 0.9500 |
| 8 | 0.8468 | 0.9168 | 0.9819 | 0.7010 | 0.8110 | 0.9500 |
| 9 | 0.8571 | 0.9137 | 0.9712 | 0.6765 | 0.7830 | 0.9449 |
| 10 | 0.7789 | 0.9098 | 0.9626 | 0.6185 | 0.7837 | 0.9382 |
| 11 | 0.8561 | 0.9406 | 0.9799 | 0.6826 | 0.8110 | 0.9496 |
| 12 | 0.8609 | 0.9573 | 0.9807 | 0.6962 | 0.8620 | 0.9648 |
| 13 | 0.8521 | 0.9338 | 0.9692 | 0.7076 | 0.8943 | 0.9504 |
| 14 | 0.8596 | 0.9363 | 0.9807 | 0.7035 | 0.8118 | 0.9394 |
| 15 | 0.8564 | 0.9321 | 0.9818 | 0.6682 | 0.8402 | 0.9424 |
| … | … | … | … | … | … | … |
The average recognition rate curves of the 100 runs on the training and test sets under algorithms 1, 2 and 3 are drawn in Fig. 22. As can be seen from Fig. 22, the per-run average training-set recognition rate of algorithm 1 is mostly 80%-90% and its test-set rate 60%-80%; for algorithm 2, the training-set rate is mostly 90%-96% and the test-set rate 75%-90%; for algorithm 3, the training-set rate is mostly 95%-99% and the test-set rate 93%-97%. The recognition effect of algorithm 3 is therefore significantly better than that of algorithms 1 and 2.
To examine the recognition accuracy and stability of the 3 algorithms, the average recognition rate and standard deviation over the 100 training runs are calculated for each algorithm, as shown in Table 4.7.
Table 4.7
Average recognition rate and standard deviation of 3 algorithms
| Algorithm | Training set | | Test set | |
| | Average recognition rate | Standard deviation | Average recognition rate | Standard deviation |
| Algorithm 1 | 85.46% | 0.0163 | 67.72% | 0.0316 |
| Algorithm 2 | 93.14% | 0.0119 | 83.21% | 0.0234 |
| Algorithm 3 | 97.51% | 0.0074 | 94.45% | 0.0078 |
It is obvious from Table 4.7 that, on both the training and test sets, algorithm 1 has the largest standard deviation of the recognition rate among the 3 algorithms and algorithm 3 the smallest; in terms of the stability of the recognition effect, algorithm 1 is therefore the least stable and algorithm 3 the most stable. The average test-set recognition rate under algorithm 1 is only 67.72%, the lowest, while under algorithm 3 it is 94.45%, the highest. This shows that the rose classification algorithm based on FE and ANN proposed in this paper not only speeds up the algorithm but also significantly improves the accuracy of rose classification.