Materials collection
Most of the CHMs decoction pieces used in this study were commercial samples purchased from CHMs markets; a small number of samples were collected in the field. All materials were authenticated by Prof. Haibo Huang and Dr. Jiayun Tong.
Dataset Construction
In previous studies on CHMs identification, recognition models were generally trained on images containing either a single slice of a Chinese herb against a clear background [12] or multiple slices heaped together against a cluttered background [16]. The latter often contain unrelated information that can seriously degrade prediction accuracy or even introduce bias. In practice, there are hundreds of commonly used CHMs in the market, mainly derived from different parts of plants, with some from animals and minerals. Moreover, before being applied in disease treatment, all CHMs undergo various processing procedures, which further diversify the characters (shapes, colors, and textures) within a category or make the characters of different categories more alike. Because the quality and representativeness of the dataset play a decisive role in AutoML model creation [29], a rich and representative dataset is essential for establishing a high-performance CHMs identification model. Therefore, all images used in this study were captured with a single slice, or multiple non-overlapping slices, of the CHMs placed on a light, untextured background without clutter. This enabled us to apply our professional knowledge to guide the algorithm toward the more relevant features of the CHMs images, and thus to learn more like a human expert. To fully investigate the feasibility of this method, an image dataset covering 315 categories of commonly used CHMs (listed in Additional file 1: Table S1) was established, including many images of easily confused CHMs. The types of easily confused CHMs are summarized as follows:
-
Adulterants (Fig. 1a1-5): fake and genuine CHMs are often mixed in the market and are hard to distinguish because of their highly similar characters, such as Ziziphi Spinosae Semen (Fig. 1a1-3) and its counterfeit Ziziphi Mauritianae Semen (Fig. 1a4-5).
-
CHMs with similar colors (Fig. 1b1-d5): some herbs are similar in color, such as Scrophulariae Radix (Fig. 1d3) and Rehmanniae Glutinosae Radix (Fig. 1d1), which are easily confused because both turn black after similar processing procedures. Some CHMs from minerals or animals with indistinctive characters can also be very hard to distinguish (Fig. 1c1-5).
-
CHMs originating from closely related plants (Fig. 1e1-5): these plants are highly similar in morphology, and the medicinal parts become even more difficult to identify after being chopped into slices, such as the CHMs from the genus Ardisia (Fig. 1e1-5). The transverse sections of the different CHMs slices often show similar colors in the bark or wood and have scattered or radial dots due to secretory cavities.
-
CHMs applied in whole or as aerial parts of the plants (Fig. 1f1-5): this type of CHMs often contains different parts of the plants—roots, stems, leaves, flowers, and fruits—which are often mixed together and require careful examination.
-
CHMs small in size (Fig. 1g1-5): compared with other types of CHMs, this type consists of seeds and fruits that are small or tiny (diameter < 5 mm).
-
CHMs using the same medicinal part — bark (Fig. 1h1-5): they often have a similar appearance and texture: flat or curved in shape, an outer surface with or without scars, and a granular or fibrous fracture surface.
Inspired by Weng’s study [17], all images used in this study were collected by our team with a smartphone camera. For some CHMs, such as small seeds or fruits (Fig. 1g1-5), a micro-lens was mounted on the smartphone camera lens to obtain high-resolution images. After eliminating low-quality and repetitive images, a total of 31,460 CHMs images were collected for dataset construction, with about 100 images (ranging from 94 to 105) in an exclusive folder for each category.
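The one-folder-per-category organization described above can be verified with a short script. This is an illustrative helper, not part of the original workflow; the folder layout and image extensions are assumptions based on the text:

```python
from pathlib import Path

def category_counts(dataset_root):
    """Return {category_name: number_of_images} for a dataset organized as
    one exclusive folder per CHM category."""
    image_suffixes = {".jpg", ".jpeg", ".png"}
    return {
        folder.name: sum(1 for f in folder.iterdir()
                         if f.suffix.lower() in image_suffixes)
        for folder in Path(dataset_root).iterdir()
        if folder.is_dir()
    }
```

The returned counts can then be checked against the reported range of 94–105 images per category.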
Dataset Pre-processing and Split
After the images were collected, each image was resized to a 1:1 length-width ratio. To reduce model training time, the resolution of each image was downscaled from 3024 × 3024 or 3456 × 3456 pixels to 850 × 850 pixels.
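This pre-processing step can be sketched with Pillow. The text does not state whether images were cropped or stretched to reach the 1:1 ratio, so this sketch center-crops before downscaling; the function name and paths are illustrative:

```python
from PIL import Image  # Pillow

def preprocess(path_in, path_out, size=850):
    """Center-crop an image to a 1:1 aspect ratio, then downscale it to
    size x size pixels (850 x 850 here, matching the paper)."""
    img = Image.open(path_in)
    side = min(img.size)                  # shortest edge of the photo
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))  # square crop
    img = img.resize((size, size), Image.LANCZOS)         # high-quality downscale
    img.save(path_out)
```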
The dataset was split into subsets according to the hierarchical structure shown in Fig. 2. First, the original dataset was split into two subsets: one for model building and one for the subsequent held-out test. The model-building dataset was then further split into a training set and an internal validation/testing set by the AutoML platforms, as described in the next subsection. The held-out dataset comprised 6260 images covering all categories, which were extracted from the original dataset and withheld from the AutoML models until the held-out test.
Building Image Recognition Models with AutoML Platforms
Huawei ModelArts [30] provided by Huawei Cloud (Huawei, Shenzhen, China) was chosen to build our CHMs image recognition model. Its main features include:
-
ModelArts ExeML provides a code-free development platform for beginners with no coding knowledge, helping users build customized, high-precision models quickly and flexibly.
-
It applies multiple pre-trained models and a self-developed deep learning framework to build models that can achieve excellent performance using a small amount of data.
To build our CHMs image recognition models, the following steps were taken according to the website tutorials [31]:
-
Entered the ModelArts platform, created an OBS bucket, and uploaded the model-building images to the bucket.
-
Created an image classification model and imported all the training data from the OBS bucket.
-
Set the parameters to their default values on the Model Configuration user interface (UI): Training Set Ratio (0.8), Validation Set Ratio (0.2), Max Inference Time (300 milliseconds), and Max Training Time (1 hour).
After the training job was submitted, ModelArts automatically searched for the best algorithm, neural architecture, and hyperparameters based on the training dataset.
For comparison, Baidu EasyDL [32], provided by Baidu Brain (Baidu, Beijing, China), a platform similar to Huawei ModelArts that likewise offers a user-friendly interface and code-free development, was also chosen to build an image recognition model with the same training dataset.
To build our image recognition model with EasyDL, the following steps were taken for the data preparation and model configuration:
-
Entered the classic EasyDL platform and created an image classification model.
-
Created a new dataset for model building on the platform’s data center and uploaded the images in the form of a zip file.
-
On the Model Training UI:
-
Selected the Public Cloud application programming interface (API) as the deployment option so that the API could be called for batch services in the subsequent held-out test.
-
Selected AutoDL Transfer as the training algorithm, which is better suited to fine-grained classification scenarios such as the CHMs classification in this study.
-
Started the training; during this process, the dataset created for model building was automatically divided into a training set (70%) and a test set (30%).
Held-out Test
A held-out test was conducted with the held-out dataset after each model was built. The numbers of images used to create the two models and to conduct the held-out test are summarized in Table 1. The held-out dataset consisted of 6260 images that were withheld from the AutoML models during model training. Because the EasyDL platform limits free API calls to 1000, 945 images randomly extracted from the same held-out dataset were used to evaluate the EasyDL model. A schematic representation of the AutoML model creation is depicted in Fig. 3. On the ModelArts platform, the test was carried out in batch mode by deploying the model and using the online batch prediction services. On the EasyDL platform, the test was carried out by deploying the model to the Public Cloud, publishing it as an API, and calling the API to use the batch prediction services.
Table 1
Detailed information on model building and held-out test
| Model | Number of Categories | Number of Images (Model Building) | Training Set Ratio | Validation Set Ratio | Number of Images (Held-out Test) |
| --- | --- | --- | --- | --- | --- |
| ModelArts | 315 | 25,200 | 0.8 | 0.2 | 6260 |
| EasyDL | 315 | 25,200 | 0.7 | 0.3 | 945 |
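The held-out evaluation loop itself is platform-independent. In the sketch below, `predict` stands in for any wrapper around a deployed ModelArts or EasyDL endpoint; the wrapper and its request format are platform-specific and not shown here:

```python
def held_out_test(samples, predict):
    """Run a held-out evaluation.

    samples : list of (image_path, true_label) pairs from the held-out set
    predict : any callable mapping an image path to a predicted label,
              e.g. a wrapper around a cloud batch-prediction API
    Returns the top-1 accuracy over the held-out samples."""
    correct = sum(1 for path, label in samples if predict(path) == label)
    return correct / len(samples)
```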
Classification Performance Measures
The goal of a machine learning algorithm is to learn from training data and predict class labels for test data. The assessment method is therefore a key factor in evaluating classification performance and guiding classifier modeling. In this study, four measures were chosen to evaluate model performance: accuracy, precision, recall, and F1-score. For overall model evaluation, accuracy is one of the most commonly used measures; it is defined as the ratio of the number of correctly classified samples to the total number of samples. For each category, precision is the proportion of correctly classified positive samples among all samples predicted as positive; recall is the proportion of correctly classified positive samples among all actual positive samples; and the F1-score is the harmonic mean of precision and recall. The F1-score ranges from zero to one, and higher values indicate better classification performance [33, 34].
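These four measures follow directly from the definitions above and can be computed from the true and predicted labels; this plain-Python sketch is illustrative and is not the platforms' internal implementation:

```python
def classification_report(y_true, y_pred):
    """Compute overall accuracy plus per-category precision, recall, and
    F1-score from paired lists of true and predicted labels."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    report = {"accuracy": accuracy, "per_class": {}}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report["per_class"][cls] = {"precision": precision,
                                    "recall": recall, "f1": f1}
    return report
```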
Manual Prediction
To investigate the similarities and differences between the AutoML model recognition and manual identification, three professionals were invited to identify each CHM category using the same images from the held-out dataset.