2.1 Materials
The bamboo rings were supplied by the International Center for Bamboo and Rattan and comprised 45 bamboo species; detailed information on the species is listed in Table 1. Rings 2 cm long were cut from the middle part of the culms and then polished with 320-mesh sandpaper to expose the vascular bundles and parenchyma for clear observation. The cross-sections of the rings were scanned with a high-resolution scanner (EPSON PERFECTION V850 PRO) in 16-bit grayscale mode at a resolution of 9600 ppi. The resulting cross-section images were used for training and testing the models described in the following sections.
Table 1 Bamboo species used for classification

| Bamboo species | Bamboo species |
|---|---|
| Phyllostachys vivax McClure | Phyllostachys glauca McClure |
| Phyllostachys aurea Rivière & C. Rivière | Bambusa multiplex var. shimadai (Hayata) Sasaki |
| Phyllostachys kwangsiensis W. Y. Hsiung, Q. H. Dai & J. K. Liu | Bambusa rigida Keng & P. C. Keng |
| Phyllostachys sulphurea var. viridis R. A. Young | Phyllostachys nidularia Munro |
| Bambusa vulgaris 'Wamin' McClure | Bambusa chungii McClure |
| Dendrocalamopsis beecheyana (Munro) Keng var. pubescens (P. F. Li) Keng f. | Phyllostachys nigra (Lodd. ex Lindl.) Munro |
| Thyrsostachys oliveri Gamble | Phyllostachys iridescens C. Y. Yao & S. Y. Chen |
| Bambusa eutuldoides McClure | Phyllostachys incarnata T. W. Wen |
| Phyllostachys parvifolia C. D. Chu & H. Y. Chou | Bambusa oldhamii Munro |
| Phyllostachys nigella T. W. Wen | Dendrocalamus minor (McClure) Chia et H. L. Fung var. amoenus (Q. H. Dai et C. F. Huang) Hsueh et D. Z. Li |
| Phyllostachys bambusoides Sieb. et Zucc. f. shouzhu Yi | Phyllostachys glabrata S. Y. Chen & C. Y. Yao |
| Bambusa multiplex (Lour.) Raeusch. ex Schult. 'Alphonse-Kar' R. A. Young | Bambusa longispiculata Gamble ex Brandis |
| Bambusa emeiensis L. C. Chia & H. L. Fung | Phyllostachys sulphurea (Carrière) Rivière & C. Rivière |
| Bambusa pervariabilis McClure | Bambusa eutuldoides McClure var. viridivittata (W. T. Lin) L. C. Chia |
| Phyllostachys bambusoides Sieb. et Zucc. f. lacrima-deae Keng f. et Wen | Bambusa textilis McClure |
| Phyllostachys propinqua McClure | Bambusa tulda Roxb. |
| Phyllostachys reticulata (Ruprecht) K. Koch | Phyllostachys prominens W. Y. Xiong |
| Phyllostachys meyeri McClure | Bambusa gibboides W. T. Lin |
| Phyllostachys edulis (Carr.) H. de Lehaie | Dendrocalamus latiflorus Munro |
| Phyllostachys nigra var. henonis (Mitford) Stapf ex Rendle | Phyllostachys aureosulcata McClure |
| Phyllostachys heteroclada Oliv. | Bambusa vulgaris Schrader ex Wendland 'Vittata' |
| Schizostachyum funghomii McClure | Bambusa tuldoides cv. Swolleninternode |
| Thyrsostachys siamensis Gamble | |
2.2 Experimental Dataset
The learning of deep neural networks is largely driven by data25, so acquiring abundant bamboo ring cross-section images was our primary task. However, the scanned images were far too large to feed into the defined models directly (18,000 × 18,000 pixels, Fig. 1(a)), so cutting the original images into smaller patches was essential. In traditional training pipelines, this is mostly done manually to generate the dataset26-28. Such an operation is time-consuming and laborious, so in this paper we used the morphological methods in OpenCV to obtain 512 × 512 training and testing samples from the original large images.
The detailed procedure is as follows. First, morphological functions29-31 are used to find the contours in the bamboo ring images, and the minimum boundingRect32-34 is computed, as shown in Fig. 1(b).
In the second step, the number of samples to take from each image was calculated from the total number required per species, so that the species were balanced. For example, if each species needs 6000 samples for training and testing, then from the 20 original images of species A, 300 samples were taken per image (6000/20), and from the 30 original images of species B, 200 samples per image (6000/30). Two sampling approaches were applied: step-size sampling and random sampling. In step-size sampling, the box moves in steps of 512 pixels ("stride_w = 512" and "stride_h = 512") starting from the upper-left corner of the boundingRect (Fig. 1(c)(d)); however, step-size sampling alone often yields too few samples. Random sampling therefore complements it by selecting boxes at random positions within the boundingRect of the bamboo ring cross-section images. After sampling with the two approaches, the low-quality samples were deleted.
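The two complementary sampling approaches can be sketched as below. Function names and the uniform placement of random boxes are our assumptions; only the 512-pixel patch size and stride come from the paper.

```python
import random
import numpy as np

def stride_crops(image, x, y, w, h, size=512, stride=512):
    """Step-size sampling: slide a size x size box over the
    boundingRect (x, y, w, h) with the paper's stride of 512."""
    crops = []
    for top in range(y, y + h - size + 1, stride):
        for left in range(x, x + w - size + 1, stride):
            crops.append(image[top:top + size, left:left + size])
    return crops

def random_crops(image, x, y, w, h, n, size=512, seed=None):
    """Random sampling: draw n boxes uniformly inside the boundingRect
    to top up species that step-size sampling leaves short."""
    rng = random.Random(seed)
    crops = []
    for _ in range(n):
        top = rng.randint(y, y + h - size)    # inclusive bounds
        left = rng.randint(x, x + w - size)
        crops.append(image[top:top + size, left:left + size])
    return crops
```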
Low-quality samples were filtered using two criteria: the interspecific mean pixel constraint and the interspecific pixel variance constraint. A sample whose mean pixel value or pixel variance falls below the average over all samples is not eligible for the dataset. In addition, the generated samples were screened manually to further ensure quality; in total, about 10% of the samples were deleted as low quality.
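A minimal sketch of the two filtering constraints, under our reading that a sample must reach the average of both statistics to be kept:

```python
import numpy as np

def filter_low_quality(samples):
    """Drop samples whose mean pixel value or pixel variance falls
    below the average over all samples (the paper's two constraints;
    the exact comparison rule is our interpretation of the text)."""
    means = np.array([s.mean() for s in samples])
    variances = np.array([s.var() for s in samples])
    keep = (means >= means.mean()) & (variances >= variances.mean())
    return [s for s, k in zip(samples, keep) if k]
```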
Finally, 1000 images were collected for each bamboo species, of which 800 were randomly selected as the training set, 100 as the validation set, and 100 as the test set. The resulting dataset is shown in Fig. 2.
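The 800/100/100 split per species can be sketched as follows; the function name and the fixed seed are illustrative.

```python
import random

def split_species(images, seed=0):
    """Randomly split the 1000 samples of one species into the paper's
    800/100/100 train/validation/test sets."""
    rng = random.Random(seed)
    shuffled = list(images)     # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:800], shuffled[800:900], shuffled[900:1000]
```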
2.3 Deep learning models
Contrary to traditional machine learning, in which features are engineered manually, deep learning models automatically extract higher-level features from the dataset. In fact, since 2012, CNN (convolutional neural network) models have won the prestigious ILSVRC competition exclusively35, and CNNs have achieved outstanding accuracy in a plethora of contemporary applications36. For image recognition and classification, CNNs achieve state-of-the-art accuracy through different variations of the basic model. In a convolutional neural network37-40, the input is generally a batch of images and the learned weights W are the filters; convolutional layers alternate with pooling layers, followed by fully connected layers in which each neuron is connected to every neuron of the previous layer. The fully connected layers work like a conventional perceptron, combining all their inputs to produce the output categories. Better models can be obtained by adjusting the network structure and the distribution of parameters; in recent years, these structures have evolved by increasing the depth and width of the networks while reducing the number of parameters. Three common CNNs, ResNet, Inception-V3 and EfficientNet, were considered; the structure of a typical CNN is shown in Fig. 3.
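The alternation of convolution and pooling described above determines how the spatial size shrinks before the fully connected head. A toy trace of a 512 × 512 patch through three conv + pool stages (layer counts and kernel sizes are illustrative, not those of the models compared here):

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

# Trace a 512x512 patch through alternating 3x3 conv (padding 1,
# size-preserving) and 2x2 max-pool (stride 2, size-halving) stages.
size = 512
for _ in range(3):
    size = conv2d_out(size, 3, padding=1)   # conv keeps the size
    size = conv2d_out(size, 2, stride=2)    # pool halves it
print(size)  # 64: 512 -> 256 -> 128 -> 64
```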
2.3.1 ResNet
ResNet41, which won the ILSVRC classification task in 2015, addresses the problem of vanishing or exploding gradients that arises when increasing the network's depth to obtain better accuracy. Shortcut connections between layers are added to fit a residual mapping: these connections skip one or more layers and perform an identity mapping, adding no new parameters. This unit, called a building block, is repeated throughout the structure.
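The identity skip can be illustrated with a minimal fully connected residual block; this is a sketch of the idea, not ResNet's actual convolutional block.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Minimal residual block: the skip connection adds the input to
    the learned mapping F(x) and introduces no extra parameters."""
    fx = relu(x @ w1) @ w2   # the residual mapping F(x)
    return relu(fx + x)      # identity skip, then activation
```

When the weights drive F(x) to zero, the block reduces to the identity, which is why stacking many such blocks does not degrade the signal.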
2.3.2 Inception-V3
GoogLeNet42 won the ILSVRC in 2014 and is based on the repetition of a module called Inception. Convolutions with a 1×1 kernel are used to increase the width and depth of the network while reducing dimensionality, and they are performed before the two larger convolutions in the module, a 3×3 and a 5×5. Inception-V343 can be considered a modification of GoogLeNet: the building block is changed by removing the 5×5 convolution and introducing two 3×3 convolutions, and the resulting network is made up of 10 Inception building blocks. In addition, the base block is modified as the network goes deeper. In five blocks, the n×n convolutions are replaced by a 1×7 followed by a 7×1 convolution to reduce the computational cost; in the last two blocks, the final 3×3 convolutions are replaced by a 1×3 and a 3×1 convolution in parallel. Moreover, the first 7×7 convolution of GoogLeNet is replaced by three 3×3 convolutions. In total, the Inception-V3 model reduces the number of parameters and extracts high-dimensional features while preserving model quality44.
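The parameter savings of these factorizations are simple arithmetic. A sketch with an illustrative channel count (biases ignored, channels kept fixed across layers):

```python
def conv_params(k_h, k_w, c_in, c_out):
    """Weight count of a conv layer with a k_h x k_w kernel."""
    return k_h * k_w * c_in * c_out

c = 64  # illustrative channel count

# A 5x5 convolution replaced by two stacked 3x3 convolutions:
before = conv_params(5, 5, c, c)                           # 25 c^2
after = 2 * conv_params(3, 3, c, c)                        # 18 c^2
print(after / before)  # 0.72, i.e. ~28% fewer weights

# An n x n convolution factorised into 1xn then nx1 (here n = 7):
before = conv_params(7, 7, c, c)                           # 49 c^2
after = conv_params(1, 7, c, c) + conv_params(7, 1, c, c)  # 14 c^2
print(after / before)  # ~0.286, i.e. ~71% fewer weights
```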
2.3.3 EfficientNet
The main contribution of EfficientNet is a standardized convolutional network scaling method that achieves high accuracy while greatly saving computing resources: the three dimensions of resolution, depth, and width are balanced jointly to design a better model structure45. EfficientNet's building blocks conventionally use k(3,3), k(5,5), or k(7,7) kernels46. Larger kernels can improve model accuracy and efficiency and help capture high-resolution patterns, while smaller kernels are better at extracting complex features from low-resolution patterns.
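The joint balancing of the three dimensions is EfficientNet's compound scaling. A sketch of the rule, using the base factors reported in the EfficientNet paper (alpha = 1.2, beta = 1.1, gamma = 1.15, chosen so that alpha · beta² · gamma² is about 2, i.e. FLOPs roughly double per unit of the coefficient phi); these values are from that paper, not from the work described here:

```python
# Base scaling factors from the EfficientNet paper's grid search.
alpha, beta, gamma = 1.2, 1.1, 1.15

def compound_scale(phi):
    """Multipliers for depth, width, and input resolution at a given
    compound coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi

# The constraint that keeps FLOPs roughly doubling per unit of phi:
print(round(alpha * beta ** 2 * gamma ** 2, 2))  # 1.92, close to 2
```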
2.4 Experimental Design
Several common deep learning models (ResNet, Inception-V3, and EfficientNet) were chosen as our training and testing models. To evaluate the performance of the different models on bamboo species classification fairly, the key hyperparameters, the optimizer, and the image augmentation operations were kept unchanged across models; the specific configuration is shown in Table 2 below.
Table 2 Hyperparameter and optimizer configuration

| Parameter | Meaning | Effect |
|---|---|---|
| Batch_size = 8 | Number of samples per iteration | A larger batch size is more conducive to finding the gradient direction along which the loss function decreases fastest; however, too large a batch size consumes too much GPU memory for training. |
| Epochs = 100 | Number of training passes over the dataset | A sufficient number of epochs helps find the global minimum, or a good local minimum, of the loss function. |
| Adam47 | Optimizer | An adaptive-learning-rate optimizer that uses the history of past gradients to adapt the learning rate and adjust the gradient descent direction. |
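The adaptive behavior of Adam noted in Table 2 can be made concrete with a single update step. This is a sketch of the standard algorithm; the hyperparameter defaults follow the original Adam paper, since the learning rate used in this study is not stated.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and its square (v) give each parameter its own effective step size."""
    m = beta1 * m + (1 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction for
    v_hat = v / (1 - beta2 ** t)               # the zero-initialised moments
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

Because the step is divided by the root of the squared-gradient average, parameters with consistently large gradients take proportionally smaller steps, which is the "adaptive learning rate" behavior the table describes.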