Application of DNA Barcoding for the Identification of a Traditional Chinese Medicine Shedan

Shedan has a long history of use in traditional Chinese medicine (TCM); however, Shedan from different original sources has been used indiscriminately. There is still no effective tool to differentiate the original source of Shedan medicinal materials, which poses a considerable risk to the safety and effectiveness of clinical applications. Hence, it is imperative to develop a practicable approach to identify Shedan medicinal materials. The specificity of two pairs of primers, Folmer’s universal primers and a pair of originally designed primers, COISNFF/COISNFR, was tested to select the more specific pair for origin identification of Shedan. A total of 253 fresh snake gallbladder samples from 31 morphologically identified snake species were collected and authenticated. Moreover, 51 fresh snake bile samples and 17 fresh bile samples from five other common domestic poultry and livestock species (cattle, chicken, duck, pig and sheep) were collected and distinguished using the more specific primers. Additionally, a total of 195 market Shedan samples randomly selected from 18 batches of Shedan medicinal materials were investigated. Species were assigned by querying sequence similarities in GenBank and the Barcode of Life Data System (BOLD), respectively.

2. The snake gallbladder and snake bile of Shedan crude drugs have each been identified.
3. A total of 304 COI barcodes (658 bp) belonging to 31 snake species have been obtained, expanding the reference barcode sequences of snakes in the GenBank database.
4. The original source of Shedan crude drugs from the market has been preliminarily clarified.

Background
Shedan, a precious traditional Chinese medicine (TCM), was first documented in the herbal Mingyi Bielu and has been used for more than two thousand years. Previous studies have shown that Shedan has the therapeutic effects of clearing away heat and detoxification, reducing phlegm, and relieving cough, and it has been commonly used in the management of mycoplasma pneumonia in children [1][2][3]. Shedan is one of the major ingredients of more than 30 Chinese patent medicines, including the widely acclaimed Niuhuang Shedan Chuanbei solution [4], Pianzaihuang [5], and Shedan Chuanbei powder [6]. Although the Chinese Pharmacopoeia (2020) states that snake bile should be derived from snakes in the three families Colubridae, Elapidae, or Viperidae, a quality standard for Shedan medicinal materials has not yet been established [7]. For decades, the demand for snakes and their related products has been rising [8]. However, excessive and indiscriminate hunting, habitat loss, and the implementation of the Law of the People's Republic of China on the Protection of Wildlife have greatly affected the supply of medicinal snakes [9] and have further led to market adulteration of snake drugs and snake-related medicinal materials [10][11]. Bile acids are the predominant biologically active ingredients in snake bile, but the bile acid profile varies greatly among snake species [12][13]. Because snake bile raw material has always been derived from mixed gallbladders, clarifying the original source of Shedan medicinal materials is an essential prerequisite for ensuring safe and effective medication use.
So far, several conventional identification methods, such as character identification, microscopic identification and chemical analysis [10,12], have been developed to distinguish snake gallbladder or snake bile from different original sources. Still, a more effective and dependable method is required to identify the original source of Shedan raw materials that lack definite morphological features or characteristic chemical constituents.
The DNA barcoding technique [14], which differentiates species by comparing query sequences derived from samples against reference barcodes of known identity in public libraries, is a powerful tool for biological identification.
In particular, a fragment (~650 bp) of the barcode region of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been used as the most effective DNA barcode marker for identifying and classifying animal-derived medicinal materials, even those with highly similar or incomplete morphological traits [15][16][17][18]. Previous studies have shown that DNA barcoding can accurately identify different snake species and distinguish snake-related medicinal materials from adulterants and substitutes by amplifying the COI sequence from muscle tissue or periostracum serpentis [19][20][21][22][23][24], but none of these studies focused on snake gallbladder or snake bile. Notably, the amplification primers used in these studies were not all identical, but the COI universal primers (LCO1490/HCO2198) [25] were adopted most often.
In the present study, we explored the feasibility of applying COI-based DNA barcoding to identify Shedan by testing fresh snake gallbladder and snake bile samples, and the barcoding method developed here was then applied to authenticate market Shedan medicinal materials.

Materials And Methods
Sample collection and processing
For the fresh snake gallbladder samples, a total of 253 fresh gallbladders (Table 1) belonging to 31 snake species in three families were collected from four snake farms in Hubei, Hunan, Jiangxi, and Zhejiang provinces, respectively. These four provinces are located in the middle and lower reaches of the Yangtze River, the main Shedan-producing area in China. After being dissected out of the morphologically identified original animals, the fresh snake gallbladders were washed with sterile water, preserved in 95% ethanol, and stored at -20°C until used for DNA extraction.
For the fresh gallbladder bile samples, a total of 68 fresh gallbladder bile samples ( bile derived from these 68 fresh gallbladders was freeze-dried into powder using a lyophilizer and stored at -20°C until used for DNA extraction. For the market Shedan samples, a total of 18 batches of market Shedan medicinal materials (Table 3) were collected from Chinese medicinal material markets or commercial companies involved in the production of Shedan and its preparations, such as Bozhou herb market and Deqing Moganshan Snakes Industrial Co., Ltd. In the market, each batch of commercial Shedan contained a different number of gallbladders, which were mixed and stored in liquor with an alcohol content of more than 50%. The market gallbladders in each batch were divided into three categories according to their size, and approximately 1/3 of them were then randomly selected. Among the chosen gallbladders from the 18 batches of market Shedan, the gallbladder bile of 13 batches was used for another purpose (chemical analysis), and the rest gallbladders and the specificity of these two pairs of primers was tested on a subset of the snake species (see Additional file 1: Table S1).
PCR amplification was carried out in a Bio-rad T100 Thermal Cycler (Bio-rad, USA) with a 25 μL reaction mixture, which contained 2.5 μL 10× PCR Buffer, 2.5 μL dNTPs (2 mM), 1.5 μL MgSO4 (1.5 mM), 0.5 U Taq polymerase (1 U/μL) (TOYOBO, Osaka, Japan), 0.75 μL of each forward and reverse primer (10 pmol/μL each), 15.5 μL of sterilized distilled water, and 1 μL of template DNA. PCR amplification with LCO1490/HCO2198 primers was performed under the following conditions: 94°C for 2 min, followed by 35 cycles of 98°C for 10 s, 53°C for 1 min, and 68°C for 1 min, and a final extension at 68°C for 5 min. PCR amplification with COISNFF/COISNFR primers was performed under the following conditions: 94°C for 2 min, followed by 35 cycles of 98°C for 10 s, 51°C for 50 s, and 68°C for 50 s, and a final extension at 68°C for 5 min.
The PCR products were confirmed on a 1.0% agarose gel, purified with the TIANGel Midi Purification Kit (Tiangen Biotech Co., Beijing, China), and bidirectionally sequenced using an ABI 3730XL DNA Analyzer (Applied Biosystems, USA).
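As a quick consistency check on the reaction mixture and cycling conditions given above, the following Python sketch sums the listed component volumes and totals the programmed hold times for both primer pairs. The `master_mix` helper and its 10% overage default are illustrative assumptions that do not come from the paper, and the run times ignore thermal ramping.

```python
# Component volumes in microlitres, as listed in the Methods.
# The 0.5 U of Taq at 1 U/uL corresponds to 0.5 uL.
COMPONENTS_UL = {
    "10x PCR buffer": 2.5,
    "dNTPs (2 mM)": 2.5,
    "MgSO4": 1.5,
    "Taq polymerase (1 U/uL)": 0.5,
    "forward primer (10 pmol/uL)": 0.75,
    "reverse primer (10 pmol/uL)": 0.75,
    "sterilized distilled water": 15.5,
    "template DNA": 1.0,
}

total_ul = sum(COMPONENTS_UL.values())  # should equal the stated 25 uL


def master_mix(n_reactions, overage=0.10):
    """Scale all components except template DNA for n reactions.

    The overage fraction is a hypothetical pipetting allowance,
    not a value taken from the paper.
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in COMPONENTS_UL.items()
        if name != "template DNA"
    }


def hold_time_s(initial_s, cycles, step_times_s, final_s):
    """Total programmed hold time in seconds (ramp times not included)."""
    return initial_s + cycles * sum(step_times_s) + final_s


# 94 C 2 min; 35 x (98 C 10 s, 53 C 1 min, 68 C 1 min); 68 C 5 min
lco_hco_s = hold_time_s(120, 35, (10, 60, 60), 300)
# 94 C 2 min; 35 x (98 C 10 s, 51 C 50 s, 68 C 50 s); 68 C 5 min
coisn_s = hold_time_s(120, 35, (10, 50, 50), 300)
```

Under these figures the listed volumes add up to the stated 25 μL, and the COISNFF/COISNFR program is shorter than the LCO1490/HCO2198 program because both its annealing and extension steps are 10 s shorter per cycle.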

Species identification and data analysis
Consensus sequences and contigs were generated using CodonCode Aligner V 4.0 (CodonCode Co., USA). After trimming the amplification primers, the sequences obtained were queried against GenBank and the Barcode of Life Data System (BOLD), respectively, for species identification. A species was confirmed when the best match was ≥ 98%; otherwise, the species of the query sequence could not be defined. The average intra- and interspecific genetic distances of the barcodes of the fresh snake gallbladder samples were calculated under the Kimura 2-parameter (K2P) distance model using MEGA 5.0 and were used to evaluate the DNA barcoding gap. Sequences generated with COISNFF/COISNFR primers were deposited in the GenBank database.
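For readers unfamiliar with the K2P model, the sketch below shows how a pairwise K2P distance can be computed from two aligned sequences. It is a minimal illustration, not the MEGA 5.0 implementation: the function name is hypothetical, and the choice to skip any site containing a gap or ambiguous base is an assumption about gap handling. With P the proportion of transitional differences and Q the proportion of transversional differences, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are
    skipped (an assumed gap-handling choice for this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging such pairwise distances within and between species gives the intra- and interspecific values used to evaluate the barcoding gap.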

Phylogenetic tree reconstruction
To infer the phylogenetic relationships and assess the accuracy of the potential barcode for species identification, a neighbor-joining (NJ) tree was constructed in MEGA 5.0, and bootstrap values were evaluated based on 1000 replicates. Acrochordus javanicus from the family Acrochordidae (GenBank accession number: KX752053) [29] was selected as the outgroup in the NJ tree. To provide additional insight into the taxonomic identity of our material, we randomly downloaded from GenBank one conspecific COI barcode sequence for each of the 31 snake species previously identified by morphology, and then analyzed them together with the barcode sequences obtained from the fresh snake gallbladder samples in the NJ tree analysis.
Investigating the market Shedan medicinal materials
The DNA extraction, PCR amplification with COISNFF/COISNFR primers, and sequencing of the market Shedan samples were performed as described above. The sequences obtained were queried against GenBank and BOLD Systems, respectively, for species determination, and they were also submitted to the GenBank database. During sequence definition, we also paid attention to the similarities between the query sequences obtained from market Shedan samples and the reference barcode sequences submitted to the GenBank database by this study.
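The best-match rule used for species determination (a species is accepted when the best match is ≥ 98%, as stated in the data analysis section) can be sketched as follows. The function name, the hit format, and the example hits are hypothetical placeholders; the sketch does not call GenBank or BOLD and assumes the search results have already been reduced to (species, percent identity) pairs.

```python
IDENTITY_THRESHOLD = 98.0  # percent; threshold stated in the Methods


def assign_species(hits, threshold=IDENTITY_THRESHOLD):
    """Return the best-matching species, or None if it cannot be defined.

    hits: list of (species_name, percent_identity) tuples from one
    database search (hypothetical format for this sketch).
    """
    if not hits:
        return None
    species, identity = max(hits, key=lambda hit: hit[1])
    return species if identity >= threshold else None


# Placeholder hits, not real search results.
confident_query = [("species A", 99.5), ("species B", 92.0)]
undefined_query = [("species C", 96.8), ("species D", 90.1)]

call_1 = assign_species(confident_query)   # best match passes the threshold
call_2 = assign_species(undefined_query)   # best match below 98%, undefined
```

Because each query was searched against GenBank and BOLD separately, the same rule would be applied once per database.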

Results
Identifying the fresh snake gallbladder samples by DNA barcoding
Genomic DNA was isolated from the fresh gallbladder tissue of each sample. For the snake species tested, although the samples could be amplified with both LCO1490/HCO2198 and COISNFF/COISNFR primers (see Additional file 2: Figure S1 and Additional file 3: Figure S2) and a standard barcode sequence could be obtained from each test specimen, the identification results showed that LCO1490/HCO2198 primers were not as specific as COISNFF/COISNFR primers for each snake species tested (see Additional file 1: Table S1). Therefore, COISNFF/COISNFR primers were selected as the optimal amplification primers for species identification of Shedan.
A total of 253 COI sequences (658 bp) were eventually generated from the fresh snake gallbladder samples and analyzed. No  Table 1.
The intra- and interspecific genetic distances of the snake species of the fresh snake gallbladder samples, based on COI barcode sequences, are summarized in Table 4. The average genetic distance within species (0.9%) was much smaller than the mean genetic distance between species (20.2%), and the highest genetic distance within species (8.5%) was less than the smallest interspecific genetic distance (9.1%), indicating a distinct barcode gap because the intra- and interspecific genetic distances did not overlap.
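The barcode-gap criterion described above reduces to a single comparison: the largest intraspecific distance must be smaller than the smallest interspecific distance. The sketch below applies it to the summary values reported in the text; the function name is hypothetical, and the full pairwise distance lists from Table 4 are not reproduced here.

```python
def has_barcode_gap(intra_distances, inter_distances):
    """True when intra- and interspecific distances do not overlap."""
    return max(intra_distances) < min(inter_distances)


# Summary values reported in the text (percent K2P distance).
reported = {
    "mean_intra": 0.9,
    "max_intra": 8.5,
    "min_inter": 9.1,
    "mean_inter": 20.2,
}

gap_present = has_barcode_gap(
    [reported["mean_intra"], reported["max_intra"]],
    [reported["min_inter"], reported["mean_inter"]],
)

# Width of the gap in percentage points (about 0.6).
gap_width = reported["min_inter"] - reported["max_intra"]
```

On the reported values the gap is present but narrow, roughly 0.6 percentage points between the extremes, even though the means differ by more than twentyfold.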
One COI barcode sequence per species per collection region was randomly selected for phylogenetic analysis, so that 52 COI sequences were finally picked from the 253 barcode sequences. These 52 COI sequences, combined with the additional 32 COI barcode sequences obtained from GenBank, were then used to construct an NJ tree (Fig. 1). In the tree, sequences within species clustered together as 32 monophyletic clades with strong support (94-100) (Fig. 1), and the monophyletic clades further formed four paraphyletic groups (Colubridae, Elapidae, Viperidae and Acrochordidae), demonstrating that these fresh snake gallbladder samples had been well authenticated to the species level. In other words, snake species could be authenticated by the COI-based barcoding method.
Distinguishing the fresh snake gallbladder bile samples using DNA barcoding
Genomic DNA was extracted from the fresh gallbladder bile of each sample. For the 51 bile samples from snakes, the desired PCR products were amplified using COISNFF/COISNFR primers (Fig. 2). In the end, 51 COI sequences (658 bp) were obtained and verified as 17 snake species in both GenBank and BOLD Systems with high sequence similarities (99-100%), consistent with the original animal species determined by morphological classification.
Authenticating the fresh gallbladder bile samples from common adulterated animals using DNA barcoding
Genomic DNA was extracted from each of the 17 fresh gallbladder bile samples from the five other common adulterant animals. Except for three duck bile samples, the remaining 14 samples showed positive PCR amplification with COISNFF/COISNFR primers (Fig. 3). As a result, 14 COI sequences (658 bp) were obtained and classified as four species (Gallus gallus, Sus scrofa, Bos taurus and Ovis aries; Table 2) in GenBank and BOLD Systems with best match ≥ 99%, consistent with their previous morphological identification.

Sensitivity of COISNFF/COISNFR primers
To determine the sensitivity of COISNFF/COISNFR primers, template DNA was diluted to a series of concentrations ranging from 100 ng/μL to 1 pg/μL. The amplification results showed that the minimum effective concentration for positive amplification was 10 pg/μL, and no amplification was detected below this concentration with COISNFF/COISNFR primers (Fig. 4).
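The sensitivity test can be illustrated with a small sketch. The text gives only the endpoints of the dilution series (100 ng/μL and 1 pg/μL) and the detection limit (10 pg/μL); the tenfold step used below is an assumption, as are the function and variable names.

```python
def tenfold_series(start_pg_per_ul, end_pg_per_ul):
    """Concentrations from start down to end in tenfold steps (pg/uL).

    The tenfold step is an assumed dilution scheme; the paper states
    only the range of the series.
    """
    series = []
    concentration = start_pg_per_ul
    while concentration >= end_pg_per_ul:
        series.append(concentration)
        concentration //= 10
    return series


DETECTION_LIMIT_PG_PER_UL = 10  # minimum effective concentration reported

# 100 ng/uL expressed as 100,000 pg/uL.
series = tenfold_series(100_000, 1)
expected_positive = [c for c in series if c >= DETECTION_LIMIT_PG_PER_UL]
expected_negative = [c for c in series if c < DETECTION_LIMIT_PG_PER_UL]
```

Under this assumed scheme the series has six concentrations, of which only the lowest, 1 pg/μL, falls below the reported detection limit.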
For the 61 gallbladder bile samples of market Shedan, genomic DNA was extracted from all but three samples (two with the batch number GD20150801 and one with the batch number AH20181024), and 31 of the extracted samples could be amplified with COISNFF/COISNFR primers to generate the desired PCR products. As a result, 31 COI sequences were obtained and assigned to six animal species (Table 3) based on the best match (98-100%) in GenBank and BOLD Systems, respectively: three snake species (C. radiatus (n=2), E. carinata (n=3) and P. mucosus (n=5)) from the family Colubridae, one snake species (N. atra (n=2)) from the family Elapidae, one snake species (X. unicolor (n=10)) from the family Xenopeltidae, and one adulterant species (G. gallus (n=9)). In total, 19.7% (12/61) of the market Shedan bile samples were derived from the first two families recorded in the Chinese Pharmacopoeia. It was worth noting that 16.4%  (Table 3). Among them, the market Shedan samples derived from the three families (Colubridae, Elapidae and Viperidae), from the family Xenopeltidae, and from adulterant species accounted for 73.3% (143/195), 6.7% (13/195) and 4.6% (9/195), respectively. Moreover, two protected snake species (C. radiatus and X. unicolor), listed as Class II in the List of Key Protected Wild Animals in China, were found, and they accounted for 9.2% (18/195) of market Shedan samples. The 165 COI sequences were also deposited in the GenBank database and their accession numbers were shown in
In addition to being useful for identifying fresh Shedan samples from the 31 snake species, the originally designed specific primers COISNFF/COISNFR were also suitable for discriminating the fresh bile samples from four other common domestic poultry and livestock species (cattle, chicken, pig and sheep).
Moreover, 13 snake species and an adulterant chicken species were identified in market Shedan samples, indicating that this method is workable for the origin identification of Shedan medicinal materials.
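The proportions reported for the market samples can be recomputed from the counts given in the Results. The sketch below uses only counts stated in the text; the variable names and the `percent` helper are illustrative.

```python
def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Market bile samples: species counts among the 31 sequenced samples.
bile_counts = {
    "C. radiatus": 2,   # Colubridae
    "E. carinata": 3,   # Colubridae
    "P. mucosus": 5,    # Colubridae
    "N. atra": 2,       # Elapidae
    "X. unicolor": 10,  # Xenopeltidae
    "G. gallus": 9,     # adulterant (chicken)
}
TOTAL_BILE = 61

colubridae_elapidae = sum(
    bile_counts[s] for s in ("C. radiatus", "E. carinata", "P. mucosus", "N. atra")
)
bile_pharmacopoeia_pct = percent(colubridae_elapidae, TOTAL_BILE)

# All market Shedan samples (195), grouped as in the text.
TOTAL_MARKET = 195
market_counts = {
    "three pharmacopoeia families": 143,
    "Xenopeltidae": 13,
    "adulterant species": 9,
}
market_pct = {group: percent(n, TOTAL_MARKET) for group, n in market_counts.items()}
protected_pct = percent(18, TOTAL_MARKET)
sequences_total = sum(market_counts.values())
```

The three market groups sum to 165, which matches the number of COI sequences reported as deposited in GenBank, and the species counts for the bile samples sum to the 31 sequences obtained.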
In this study, the snake gallbladder and snake bile of Shedan crude drugs were investigated. Most of the market Shedan samples were identified to the species level, and the original species of current market Shedan medicinal materials came not only from the three families (Colubridae, Elapidae and Viperidae) stated in the Chinese Pharmacopoeia but also from another family, Xenopeltidae. Unexpectedly, a small number of low-value chicken gallbladders were found as adulterants in high-value commercial Shedan medicinal materials. These identification results revealed that the original source of Shedan crude drugs in the market is relatively complicated, with key protected snake species and adulterant chicken detected among them, indicating that more attention should be paid to strengthening the protection of wild snake resources during their development and utilization and, at the same time, to reinforcing the supervision of the source of market Shedan medicinal materials.
Except for a few that are used fresh, most medicinal materials undergo traditional processing procedures soon after collection, such as fumigating, sun-drying, slicing and powdering, to ensure quality and facilitate storage and transportation [32]. However, some conventional processes blur or even destroy the morphological characteristics, which further hinders the morphological identification of medicinal materials. At the same time, DNA degradation or fragmentation may occur during processing, which significantly impedes the amplification of full-length barcodes from highly processed materials [33-34]. Moreover, although genomic DNA was extracted from the duck bile specimens and the amplification conditions were adjusted repeatedly, no PCR products could be amplified with COISNFF/COISNFR primers, which might be caused by the inability of the primers to bind to the duck template. Therefore, the failure to identify some market Shedan bile samples because PCR amplification failed might be caused mainly by serious DNA degradation, or even complete DNA degradation in the case of the three market Shedan bile samples from which no genomic DNA could be extracted; alternatively, these samples might have come from animal species that cannot be identified through DNA barcoding with COISNFF/COISNFR primers. Future studies should develop other methods, such as mini-barcoding [35-37], to clarify the original source of Shedan medicinal materials more comprehensively.

Conclusion
This research has established a COI-based DNA barcoding approach for the molecular identification of Shedan used in TCM. It also suggests that COISNFF/COISNFR primers could serve as a pair of candidate universal primers for origin identification of Shedan. The original source of market Shedan is reported here for the first time, which provides a preliminary basis for further studies on quality control of Shedan crude drugs.
Table 4 Summary of genetic divergences (K2P model) among the snake species identified based on COI barcode sequences
Figure 2 Amplification of the fresh snake bile specimens from 17 snake species with COI specific primers COISNFF/COISNFR. 1: C.