An Easy Method for Identifying 315 Categories of Commonly-Used Chinese Herbal Medicines Based on Automated Image Recognition Using AutoML Platforms

Background: The identification and authentication of Chinese herbal medicines (CHMs) are directly related to their safety and efficacy in clinical treatment. However, the limited number of qualified professionals fails to meet the demand of the vast CHMs market. To make CHMs identification more convenient and accurate, this study aimed to assess the feasibility of state-of-the-art automated machine learning (AutoML) technology in CHMs image recognition.
Methods: This study presented an experimental AutoML model built on the one-stop Huawei ModelArts platform instead of a handcrafted neural network. A rich and representative dataset of 31,460 images covering 315 categories of commonly-used CHMs was built and used for the model creation. Furthermore, the Huawei ModelArts model was compared with a model built on the Baidu EasyDL platform using the same dataset to investigate their ability to recognize CHMs images. Three professionals were also invited to recognize images of the 315 categories of CHMs.
Results: During the model evaluation, high accuracies of 99.2% and 98.4% were achieved by ModelArts and EasyDL, respectively. In the subsequent held-out tests, the accuracies of the ModelArts and EasyDL models were 91.21% and 91.85%, respectively. Both models performed very well individually, and no statistically significant difference was found in model performance between the two platforms. However, the model-training time was only approximately 41 minutes on the ModelArts platform but 118 minutes on EasyDL. The mean accuracy of the manual recognition for the 315 CHMs was 97.46 ± 1.58%.
Conclusion: The results revealed that AutoML technology is a fast and simple approach with great practical potential in the field of CHMs image recognition. Since the Huawei ModelArts platform requires less training time, we recommend it as a priority.


Background
Chinese herbal medicines (CHMs) are important materials in the prevention and treatment of disorders or diseases in traditional Chinese medicine. Most CHMs originate from natural or cultured plants, with some from cultured animals and minerals, and the popularity of CHMs is increasing globally [1,2]. For most consumers with no professional identification knowledge, it is almost impossible to identify the hundreds of different CHMs, even those commonly used in the treatment of diseases. This may create opportunities for some retailers to maximize profits by adding fake, non-officinal herb parts, or inferior plant materials [3]. Moreover, some CHMs are easily confused even by professionals because of their similarities in shape, texture, and color [4]. However, the identification and authentication of Chinese herbs are directly related to their safety, quality, and efficacy in clinical medication [5]. Adulteration and misidentification of CHMs can cause adverse effects or even fatality [6]. Therefore, the accurate identification of CHMs is crucial to regulating the chaotic CHMs market and controlling the quality of CHMs.
The identification methods of CHMs mainly include macroscopic identification, microscopic identification, physicochemical identification, and biological identification. The latter three methods often require sophisticated equipment only available in laboratories [7]. As a traditional method to identify CHMs, empirical macroscopic identification has been proved over thousands of years of practice to effectively reflect the authenticity and quality of CHMs, and it is widely adopted by community pharmacies due to its feasibility [8]. Macroscopic identification is based on appearance characters such as shape, size, color, texture, smell, and taste. This method relies heavily on the knowledge of human professionals, which is somewhat subjective and varies among individuals. Thus, how to make CHMs identification more feasible, objective, and accessible is attracting more and more attention from practitioners.
Previous studies on CHMs identification based on image recognition mainly focused on building CHMs image recognition models with handcrafted algorithms using low-level image features such as shape [9], color [10,11], and texture [12]. However, these features are easily affected by image appearance and often require high-resolution images captured by sophisticated camera devices. With the rapid development of computer science, machine learning has become an excellent approach to image recognition tasks in a wide variety of fields such as clinical medicine [13], molecular biology [14], and plant recognition [15]. This cutting-edge technology has also been used for automated CHMs identification and has shown promising application prospects: CHMs identification models built with machine learning techniques such as convolutional neural networks and transfer learning can automatically learn representative features, some of which may even be overlooked by humans, from a large amount of image data through operations such as convolution and pooling. These models have been proved to be robust and can achieve high precision [16-18].
While machine learning has shown great potential in CHMs identification, building an effective classification model remains a burdensome task. Model creation often requires immense resources: high-capacity memory, powerful graphics processing units, the help of professionals during development, and long training times [19]. Even though pre-trained neural architectures such as AlexNet [20], VGGNet [21], and ResNet [22] can reduce the amount of labeled data needed to train a model, data labeling and model fine-tuning can still be very laborious and time-consuming, which greatly impedes the development of machine learning models in both academia and industry. To overcome these challenges, automated machine learning (AutoML) has emerged as a new sub-area of machine learning. With automated model selection [23], neural architecture search [24], and feature engineering [25], AutoML not only simplifies the creation and application of machine learning models but also greatly reduces the turnaround time and improves the accuracy of customized models by removing human errors [26].
Several AutoML products are now available in the industry. They provide end-to-end AutoML pipelines that reduce user intervention during model development: the user simply provides data, and the AutoML system automatically determines the approach that performs best for the particular application [23]. This technology has been shown to produce encouraging results in various studies with only small numbers of images [27,28]. With the help of AutoML, researchers can focus more on solving problems with greater application and business value.
In this study, to make CHMs identification more convenient and accurate, experimental machine learning models were created and evaluated using state-of-the-art AutoML platforms available on the internet instead of handcrafted machine learning algorithms. Given the potential for subsequent applications, a handy device, the smartphone, was used to capture images for AutoML model creation. Thus, this work can greatly lower the barrier to CHMs identification, allowing ordinary people to identify different CHMs in their daily life, and it has great potential in commercial applications.

Materials collection
Most of the CHMs decoction pieces used in this study were commercial samples purchased from CHMs markets; a small number of samples were collected from the field. All of these materials were authenticated by Haibo Huang (Prof.) and Jiayun Tong (Ph.D.).

Dataset Construction
In previous studies on CHMs identification, recognition models were generally trained using images with a single slice of Chinese herb on a clear background [12], or multiple slices heaped together on a cluttered background [16]. The latter often contained unrelated information that could seriously degrade the prediction accuracy and even introduce bias. There are hundreds of commonly-used CHMs in the market, which mainly come from different parts of plants, with some from animals and minerals. Before they are applied in disease treatment, all CHMs have to go through different processing procedures, which further diversify the characters (shapes, colors, and textures) within the same category or make the characters of different categories more alike. Because the quality and representativeness of the dataset play a decisive role in AutoML model creation [29], a rich and representative dataset is needed to establish a high-performance CHMs identification model. Therefore, all the images used in this study were captured with a single slice or non-overlapping multiple slices of the CHMs placed on a light, untextured background without clutter. This enabled us to apply our professional knowledge to guide the algorithm to find more relevant features in the CHMs images and thus to learn more like a human. To fully investigate the feasibility of this method, an image dataset with 315 categories of commonly-used CHMs (listed in Additional file 1: Table S1) was established, which contains many images of easily-confused CHMs.
Different types of these easily-confused CHMs are summarized as follows:
1. Adulterants (Fig. 1a1-5): fake and genuine CHMs are often mixed in the market and are hard to distinguish because of their highly similar characters, such as Ziziphi Spinosae Semen (Fig. 1a1-3) and its counterfeit Ziziphi Mauritianae Semen (Fig. 1a4-5).
2. CHMs with similar colors (Fig. 1b1-d5): some herbs are similar in color, such as Scrophulariae Radix (Fig. 1d3) and Rehmanniae Glutinosae Radix (Fig. 1d1), which are easily confused because of their black color after going through similar processing procedures. Some CHMs from minerals or animals with indistinctive characters can also be very hard to distinguish (Fig. 1c1-5).
3. CHMs originating from closely related plants (Fig. 1e1-5): these plants are highly similar in morphology, and the medicinal parts of each plant become even more difficult to identify after being chopped into slices, such as CHMs from the genus Ardisia (Fig. 1e1-5). The transverse sections of the different CHMs slices often show a similar color in the bark or wood and have scattered or radial dots due to the secretory cavities.
4. CHMs applied as whole plants or aerial parts (Fig. 1f1-5): this type of CHMs often contains different parts of the plants (roots, stems, leaves, flowers, and fruits), which are often mixed and require careful examination.
5. CHMs small in size (Fig. 1g1-5): compared with other types of CHMs, this type consists of seeds and fruits that are small or tiny (diameter < 5 mm).
6. CHMs using the same medicinal part, bark (Fig. 1h1-5): they often have a similar appearance and texture: flat or curved in shape, an outer surface with or without scars, and a granular or fibrous fracture surface.
Inspired by Weng's study [17], all the images used in this study were collected by our team with a smartphone camera. For some CHMs such as small seeds or fruits (Fig. 1g1-5), a micro-lens was mounted on top of the mobile phone camera lens to obtain high-resolution images. After low-quality and repetitive images were eliminated, a total of 31,460 CHMs images were collected for dataset construction, with about 100 images (ranging from 94 to 105) in an exclusive folder for each category.

Dataset Pre-processing And Split
After the images were collected, each image was resized to a length-width ratio of 1:1. To reduce the training time of the models, the resolution of each image was downscaled from 3024 × 3024 pixels or 3456 × 3456 pixels to 850 × 850 pixels.
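The geometry of this pre-processing step can be sketched in a few lines. The text does not say whether the 1:1 ratio was reached by cropping or by stretching, so the centered square crop below is an assumption, as are the function names and the raw sensor sizes used in the example (4032 × 3024 and 4608 × 3456); only the square sizes and the 850-pixel target come from the text.

```python
TARGET_SIDE = 850  # target resolution reported in the text


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Assumption: the 1:1 ratio is obtained by a centered crop.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def pixel_reduction(side, target=TARGET_SIDE):
    """How many times fewer pixels a side x side image has after downscaling."""
    return (side / target) ** 2


if __name__ == "__main__":
    # Hypothetical raw photo sizes; the text only gives the square sizes.
    for width, height in [(4032, 3024), (4608, 3456)]:
        box = center_square_box(width, height)
        side = box[2] - box[0]
        print(f"{width}x{height}: crop box {box}, "
              f"{side}x{side} -> {TARGET_SIDE}x{TARGET_SIDE}, "
              f"about {pixel_reduction(side):.1f}x fewer pixels")
```

The crop box uses the (left, top, right, bottom) convention common to image libraries, so it could be handed to a crop routine before resizing.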
The dataset was split into subsets according to the hierarchical structure shown in Fig. 2. First, the original dataset was split into two subsets: one for model building and another for the subsequent held-out test. Then, the model-building dataset was further split into a training set and an internal validation/testing set by the AutoML platforms, as described in the next subsection. The held-out dataset included 6260 images covering all categories, which were extracted from the original dataset and were not exposed to the AutoML models before the held-out test.
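A per-category hold-out split of this kind might look like the sketch below. The figure of 80 model-building images per category is an inference from the reported totals (31,460 − 6,260 = 25,200 = 315 × 80), not something the text states, and the function name, file names, and random seed are hypothetical.

```python
import random
from collections import defaultdict


def split_held_out(labelled_paths, n_build_per_class=80, seed=0):
    """Split (path, label) pairs into model-building and held-out sets.

    Assumption: a fixed number of images per category goes to model
    building and the remainder of each category is held out.
    """
    by_class = defaultdict(list)
    for path, label in labelled_paths:
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    build, held_out = [], []
    for label in sorted(by_class):
        paths = sorted(by_class[label])
        rng.shuffle(paths)
        build += [(p, label) for p in paths[:n_build_per_class]]
        held_out += [(p, label) for p in paths[n_build_per_class:]]
    return build, held_out


if __name__ == "__main__":
    # Toy data: three categories with 94, 100, and 105 images each.
    toy = [(f"{name}_{i}.jpg", name)
           for name, count in [("herb_a", 94), ("herb_b", 100), ("herb_c", 105)]
           for i in range(count)]
    build, held_out = split_held_out(toy)
    print(len(build), len(held_out))
```

Splitting within each category keeps every category represented in both subsets, which matters when per-category precision and recall are reported later.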

Building Image Recognition Models With AutoML Platforms
Huawei ModelArts [30], provided by Huawei Cloud (Huawei, Shenzhen, China), was chosen to build our CHMs image recognition model. Its main features include the following. ModelArts ExeML provides a customized, code-free development platform for beginners with no coding knowledge, and it can help users build a customized, high-precision model quickly and flexibly.
It applies multiple pre-trained models and a self-developed deep learning framework to build models that can achieve excellent performance using a small amount of data.
To build our CHMs image recognition models, the steps in the website tutorials [31] were followed. After the training job was submitted, ModelArts automatically searched for the best algorithm, neural architecture, and hyperparameters based on the training dataset.
For comparison, Baidu EasyDL [32] provided by Baidu Brain (Baidu, Beijing, China), a platform similar to Huawei ModelArts, which also provides a user-friendly interface and code-free development, was also chosen to build an image recognition model with the same training dataset.
To build our image recognition model with EasyDL, the following steps were taken for data preparation and model configuration:
1. Entered the classic EasyDL platform and created an image classification model.
2. On the Model Training UI, selected Public Cloud application programming interface (API) as the deployment mode so that the API could be called to use the batch services for the subsequent held-out test.
3. Selected AutoDL Transfer as the training algorithm, which is more suitable for fine-grained classification scenarios such as the CHMs classification in this study.
4. Started the training; the model-building dataset was automatically divided into a training set (70%) and a test set (30%) during the process.

Held-out Test
A subsequent held-out test was conducted with the held-out dataset after each model was built. The numbers of images used to create these two models and the held-out test were summarized in Table 1.
The held-out dataset was created with 6260 images that were not exposed to the AutoML models during model training. Because the EasyDL platform limits free API calls to 1000, 945 pictures randomly extracted from the same held-out dataset were used to evaluate the EasyDL model. The schematic representation of the AutoML model creation is depicted in Fig. 3. On the ModelArts platform, the test was carried out in batch mode by deploying the model and using the online batch prediction services. On the EasyDL platform, the test was carried out by deploying the model to the Public Cloud, publishing the model as an API, and calling the API to use the batch prediction services.
The goal of a machine learning algorithm is to learn from training data and predict class labels for testing data. Therefore, the assessment method is a key factor in evaluating classification performance and guiding classifier modeling. In this study, four measures, namely accuracy, precision, recall, and F1-score, were chosen to evaluate the models' performance. For overall model evaluation, accuracy is one of the most commonly used measures of classification performance, and it is defined as the ratio of the number of correctly classified samples to the total number of samples. For each category, precision is the ratio of correctly classified positive samples to the total number of samples predicted as positive; recall is the ratio of correctly classified positive samples to the total number of positive samples; and the F1-score is the harmonic mean of precision and recall. The F1-score ranges from zero to one, and high values indicate high classification performance [33,34].
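The four measures defined above can be computed directly from paired lists of true and predicted labels. The following is a minimal sketch, not the platforms' own evaluation code; the function names and the toy labels are hypothetical, and a category with no predictions or no samples is given a score of zero by convention.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Correctly classified samples divided by all samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted as p, but it was something else
            fn[t] += 1  # a true t that was missed

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted_pos = tp[label] + fp[label]
        actual_pos = tp[label] + fn[label]
        precision = tp[label] / predicted_pos if predicted_pos else 0.0
        recall = tp[label] / actual_pos if actual_pos else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return report


if __name__ == "__main__":
    # Toy labels for three hypothetical categories A, B, and C.
    y_true = ["A", "A", "A", "B", "B", "C"]
    y_pred = ["A", "A", "B", "B", "B", "C"]
    print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
    for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
        print(f"{label}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

In the toy example one A image is mistaken for B, which lowers the recall of A and the precision of B while leaving C untouched; a category whose images are never predicted correctly ends up with a precision of zero.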

Manual Prediction
To investigate the similarities and differences between the AutoML model recognition and manual identification, three professionals were invited to identify each CHM category using the same images from the held-out dataset.

Models evaluation and held-out test
Two CHMs image recognition models were built with the same model-building dataset on ModelArts and EasyDL, respectively. After each model was built, the platform presented an evaluation report based on the predicted results of the validation set (ModelArts) or testing set (EasyDL). The performances of the two models in terms of accuracy, F1-score, precision, and recall are summarized in Table 2. During the model evaluation, high accuracies of 99.2% and 98.4% were achieved by ModelArts and EasyDL, respectively. In the held-out tests, the accuracies of the ModelArts and EasyDL models were 91.21% and 91.85%, respectively. Figure 4 shows some examples of CHMs categories correctly recognized by the ModelArts model. The results in Table 3 showed that the ModelArts model performed generally well in recognizing most of the easily-confused CHMs, with the average precision surpassing 0.81. To visualize the performance of the ModelArts model in recognizing easily-confused CHMs, the F1-score values of both the model evaluation and the held-out test for each category were plotted as a heatmap (Fig. 5a). These results showed that the model performed well in recognizing most of the easily-confused CHMs (Fig. 5a-g). The exception was the group shown in Fig. 5h, for which the score was relatively low. A few CHMs categories were also misidentified by the ModelArts model (some examples are shown in Fig. 6).

AutoML recognition vs. manual identification
To better understand the differences between the AutoML recognition model and identification by humans, three professionals were invited to identify all 315 CHMs categories using the same images from the held-out dataset. In the held-out test of ModelArts, six categories (1.90% of the 315 CHMs) were completely misrecognized, with a precision of zero: Cornu Cervi Pantotrichum, Lasiosphaera Calvatia, Arisaematis Rhizoma, Cirsii Herba, Polygonati Odorati Rhizoma, and Arisaematis Rhizoma Preparatum. In contrast, the false recognition rate of the manual prediction for the 315 CHMs was 2.54 ± 1.58% (a mean accuracy of 97.46 ± 1.58%).

Discussion
In this study, an automated identification method for commonly-used CHMs was established. First, a representative CHMs image dataset with more than 300 categories of CHMs was constructed. Then, classification models were built on two one-stop AutoML development platforms, Huawei ModelArts and Baidu EasyDL. Overall, both models performed well in recognizing different CHMs images, with the accuracy surpassing 98% in the evaluation and 91% in the held-out test. The results showed that although the CHMs slices within each category varied in shape or color, the ModelArts model could successfully recognize them after being trained on images containing the key features and enough details of CHMs for identification (examples in Fig. 4). The difference between the accuracy of the model evaluation and that of the held-out test also indicates that overfitting exists in both models. Thus, carrying out a subsequent held-out test is necessary.
Compared with other machine learning techniques, the AutoML technology provided by the ModelArts or EasyDL platform requires no programming knowledge and offers a user-friendly interface through web-based applications. A distinctive feature of the EasyDL platform is its data augmentation strategy, with which the dataset can be augmented by altering the appearance of images to enhance the model training effect. For example, images can be cropped, rotated, blurred, and flipped to improve the model's ability to recognize the test images. Since the quality of images captured by a smartphone camera is easily affected by environmental factors such as lighting conditions, camera position, and camera parameters, this strategy can make the model more robust under different scenarios [17]. However, detailed information, including the total number of images after augmentation, is not provided by the platform. As this feature is not available on the ModelArts platform, and to simplify the experimental operation, the augmentation strategy on EasyDL was left at its default setting; that is to say, the strategy was not used to artificially enlarge the dataset.
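The kinds of transformations mentioned above can be illustrated on a tiny image stored as nested lists of pixel values. This is only a sketch of the general idea: it is not EasyDL's implementation, the function names are hypothetical, and blurring is left out to keep the example short.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image a quarter turn clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def center_crop(image, size):
    """Keep the central size x size window of the image."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def augment(image):
    """Return the original image plus two transformed variants."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


if __name__ == "__main__":
    tiny = [[1, 2, 3],
            [4, 5, 6]]
    for variant in augment(tiny):
        print(variant)
```

Each variant shows the same herb slice in a different orientation, which is one way a model can become less sensitive to how the camera was held.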
It is also important to note that both platforms charge for services such as cloud storage, model training, and model deployment but offer free services for a limited time during the development period [35,36]. For model deployment, both platforms allow users to deploy the model for serving in several ways: real-time, batch, or edge services. The real-time services allow users to deploy a model as a web service that provides a real-time test UI and monitoring capabilities. The batch services perform inference in batch mode, which allows high-throughput prediction. The edge services provide users with a complete edge computing solution, in which cloud applications are extended to the edge. By leveraging edge-cloud synergy, users can manage applications remotely and process data nearby. In this study, the batch services were used to conduct the held-out test; EasyDL offers users a limit of 1000 free API calls for batch prediction, while ModelArts provides a free online batch service for a limited time (1 hour).
Although our analysis found no significant difference between the two AutoML platforms in terms of model performance, the ModelArts model was set as the baseline model during our research for the following reasons. Firstly, the time to train the ModelArts model was far shorter than that of the EasyDL model with the default computational setting. Although it is possible to decrease the model-training time of EasyDL by selecting a different algorithm provided by the platform, this might sacrifice the precision of the model. Secondly, as the EasyDL platform provides only 1000 free API calls, only 945 images in total (three images per category) could be uploaded for the held-out test of EasyDL. Therefore, with far more images (6260) in its held-out test, the ModelArts model was set as the baseline model, and the total cost of the experimental AutoML model was $0.
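The text does not name the statistical test behind the finding of no significant difference. Purely as an illustration, the sketch below applies a two-proportion z-test to the two held-out accuracies. The choice of test is an assumption, and the counts of correct predictions are reconstructed by rounding the reported percentages (91.21% of 6260 and 91.85% of 945) rather than taken from the source.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Assumption: counts rebuilt from the reported held-out accuracies.
    modelarts_n, easydl_n = 6260, 945
    modelarts_correct = round(0.9121 * modelarts_n)
    easydl_correct = round(0.9185 * easydl_n)

    z, p_value = two_proportion_z_test(
        modelarts_correct, modelarts_n, easydl_correct, easydl_n)
    print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
    print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

Under these assumptions the p-value is well above 0.05, which is consistent with the reported conclusion, although the two test sets differ greatly in size and the original analysis may have used another method.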
The performances of the experimental AutoML models in this study and the models in previous studies are compared in Table 4 [16-18]. Both Weng's and Wu's models achieved high accuracies, of 96% and 97%, respectively. However, both models were built with only 11 categories of CHMs. In another study, Sun's model, built with 95 categories of CHMs, achieved an average recognition precision of only 71%, and all of the images used for model building were downloaded from Google. The reason may be that each image contained multiple slices of CHMs heaped together against a complex background, which means those images often contained unrelated information that could seriously degrade the prediction accuracy and even introduce bias [37]. By contrast, the models in this study are more promising for automated CHMs identification, achieving high accuracy using AutoML algorithms instead of handcrafted machine learning algorithms. The method proposed in this study can greatly simplify the operation, reduce the time needed to develop a CHMs recognition model, and improve its performance.
The results showed that the ModelArts model can correctly recognize most of the CHMs images, including those that are confusing even for human professionals. However, it still has certain limitations in differentiating some highly similar CHMs images, such as those of Periplocae Cortex and Lycii Cortex. This is not surprising because they (Fig. 1g1-2) look very similar in the images. Interestingly, we also found that the AutoML models misrecognized some images of CHMs (derived from different parts of plants) that are distinct and not difficult for human professionals to identify with material objects (Fig. 6b, h). As AutoML solutions are generally black boxes [38], we speculate that this might be due to the difference between the ways AutoML and humans learn to recognize different CHMs.
As humans, we often combine multiple senses during the CHMs identification process and use different information, such as the size, weight, smell, and even taste of different CHMs, as important features to identify them. However, AutoML learned mainly from the images we presented, which contain only visual information and are limited in scope. Different CHMs may look highly similar when captured by the camera at certain angles. Sometimes there were only slice fragments in the images, which made them harder to distinguish (Fig. 6). Overall, the results showed that the identification accuracy of the AutoML algorithm was close to that of the three professionals, and both the AutoML algorithm and the professionals failed to identify some visually-confusing CHMs images. This means that human professionals also found it hard to differentiate some CHMs images with little difference (Fig. 6). This is understandable since some CHMs materials are easily confused even in real life.

Conclusions
In this study, an experimental AutoML model for automated CHMs identification was built and evaluated using Huawei ModelArts. With high accuracies of 99.2% and 91.2% in the model evaluation and the held-out test, respectively, it can be concluded that this CHMs image recognition model has successfully learned to recognize more than 300 different CHMs. By providing a user-friendly UI and flexible deployments, current state-of-the-art AutoML technology makes the development of CHMs recognition models simpler and faster. Thus, with the flexible deployments offered by the AutoML platforms, this work has the potential to greatly lower both the cost of commercial applications in the CHMs recognition field and the barrier to CHMs recognition. However, significant challenges still exist in CHMs image recognition with AutoML, particularly in the recognition of easily-confused CHMs that are highly similar in images or even as material objects. To further improve the model performance, our team will enrich the training dataset by capturing images of CHMs from different angles to obtain more detailed features of each herb and will apply the data augmentation strategy to make the model more robust.

List Of Abbreviations
CHMs: Chinese herbal medicines; AutoML: Automated machine learning; UI: User interface; API: Application programming interface.

Competing interests
The authors declare that they have no competing interests.

Funding