Development of short tandem repeat (STR) and derived cleaved amplified polymorphic (dCAPS) markers for distinguishing species and varieties of the genus Panax in Vietnam

In this study, we developed a protocol for the authentication of Panax vietnamensis var. vietnamensis (Ngoc Linh ginseng) by combining two types of molecular markers: short tandem repeats (STR) and derived cleaved amplified polymorphic sequences (dCAPS). Two STR markers, Pvm30 and Pvm31, were identified in the chloroplast genome of P. vietnamensis var. vietnamensis. These markers accurately identified Panax stipuleanatus, Panax vietnamensis var. fuscidiscus, and Panax ginseng. However, the chloroplast genome sequences of P. vietnamensis var. vietnamensis and P. vietnamensis var. langbianensis were highly similar (99.96%), so the STR markers could not distinguish these two varieties. Therefore, a dCAPS marker, PvmdCAPS, was applied to compensate for this limitation of the STR markers. Based on the alignment of the matK coding sequences of the two varieties, the PvmdCAPS primers were designed at a single nucleotide polymorphism (SNP) at nucleotide position 248 and were able to discriminate between them. In summary, the combination of STR and dCAPS markers was used to distinguish Panax species in Vietnam, especially P. vietnamensis var. vietnamensis.

Authentication of ginseng origin is now important to protect the interests of consumers and to control the brand names of ginseng species (Ho and Pham 2020; Baeg 2022). In Vietnam, the reported Panax species are P. stipuleanatus H.T. Tsai et K.M. Feng (Tsai and Feng 1975) and P. vietnamensis Ha et Grushv (Ha and Grushvitzky 1985). Based on molecular evidence from 18S rRNA, ITS, and matK gene sequences, P. vietnamensis has been divided into three varieties: P. vietnamensis var. vietnamensis, P. vietnamensis var. fuscidiscus, and P. vietnamensis var. langbianensis (Phan et al. 2014; Nong et al. 2016; Pham et al. 2020; Duy et al. 2020). These ginsengs differ in market value, and Panax vietnamensis var. vietnamensis (Ngoc Linh ginseng) has a higher economic value than the others. Previous studies have described morphological features that discriminate P. vietnamensis var. vietnamensis from other ginseng species (Ha and Grushvitzky 1985; Phan et al. 2014; Nong et al. 2016). However, the root is the ginseng product of greatest interest on the market. Panax species in Vietnam, including P. stipuleanatus and the P. vietnamensis varieties, have bamboo-like rhizomes with high morphological similarity (Tsai and Feng 1975; Ha and Grushvitzky 1985; Nong et al. 2016), so distinguishing the roots of these ginsengs has been challenging in practice. Therefore, a technique is needed to distinguish Ngoc Linh ginseng from the others.
Simple sequence repeats (SSR), also called microsatellites, have frequently been used as molecular markers in authentication studies of ginseng species (Ma et al. 2007). In addition, expressed sequence tag (EST)-SSR markers have been used to authenticate Panax cultivars by assessing differences in specific alleles. In previous studies, P. ginseng cultivars could be identified by one EST-SSR marker or a combination of them (Cheng-Jun et al. 2008; Kim et al. 2012). However, the search for SSR or EST-SSR markers was time-consuming and required a genomic database (Jo et al. 2017).
Another microsatellite marker, the short tandem repeat (STR), has been promising for ginseng authentication. By screening STR markers in over 190 ginseng samples, Hon et al. (2003) found nine microsatellites that could differentiate American and Oriental ginsengs and proposed the use of STR markers for the identification of Panax species. Subsequently, Qin et al. (2005) developed a rapid technique for the identification of ginseng species based on STR markers. Currently, chloroplast genomes of most Panax species have been published in GenBank, and polymorphic microsatellites have frequently been detected in these genomes (Kim et al. 2015; Nguyen et al. 2018). Therefore, it is possible to generate STR markers from chloroplast genomic data and check their specificity for each Panax species or variety.
Recently, Nguyen et al. (2018) used derived cleaved amplified polymorphic sequences (dCAPS) markers to identify ginseng species. The authors analyzed the chloroplast genomes of seven Panax species and discovered 1128 single nucleotide polymorphisms (SNPs) in coding gene sequences. Subsequently, eighteen dCAPS markers were designed at the SNP sites and allowed these ginsengs to be discriminated from each other (Nguyen et al. 2018). In addition, Nong et al. (2016) identified a new variety of Panax in Vietnam, P. vietnamensis var. langbianensis, based on a nucleotide sequence comparison of three gene regions: ITS1-5.8S-ITS2, 18S rRNA, and matK. Notably, the authors listed the SNP positions of the matK gene, which could be used to design dCAPS markers for distinguishing Panax species and varieties in Vietnam.
Agriculture and Life Science, Seoul National University, Korea. All samples were stored and preserved at the Incubation and Support Center for Technology and Science Enterprises, Ministry of Science and Technology (MOST).

Plant materials and DNA extraction
Total DNA was extracted from the roots of ginseng species using the cetyl trimethylammonium bromide (CTAB) method (Doyle 1991), modified by the addition of 1% polyvinylpyrrolidone (PVP). DNA quantity was measured with a NanoVue Plus™ Spectrophotometer (Biochrom), and DNA quality was analyzed by 1% agarose gel electrophoresis.

Development and selection of STR markers
The chloroplast genome of P. vietnamensis var. vietnamensis (accession number KP036470) was used as a template to discover STR markers (Nguyen et al. 2018). Short tandem repeat sequences were searched with the online tool Tandem Repeats Finder (https://tandem.bu.edu/trf/trf.html) (Benson 1999) and are listed in Table 1. Next, STR marker primers were designed with Primer3plus (https://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi) to give polymerase chain reaction (PCR) products of about 150-300 nucleotides. Afterward, the primer pairs were used to perform online PCR (https://www.bioinformatics.org/sms2/pcr_products.html) with the chloroplast genomes of the following ginseng species as templates: P. stipuleanatus (KX247147), P. vietnamensis var. fuscidiscus (MT798585), P. vietnamensis var. langbianensis (MT798583), and P. ginseng (KM088019) (Nguyen et al. 2018). An STR primer pair was selected for the experiment if its online PCR products differed by more than ten nucleotides between Panax species, so that the products could be resolved easily on agarose gel. PCR for STR markers was performed in a volume of 20 µL consisting of 1× DreamTaq Buffer, 1 unit of DreamTaq DNA polymerase (Thermo Fisher Scientific), 1 µM of each primer, 80-100 ng of total DNA, and double-distilled water (ddH2O). The thermal cycle began at 94 °C for 5 min, followed by 30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 72 °C for 30 s, and a final extension at 72 °C for 5 min. The PCR products were analyzed by 2% agarose gel electrophoresis and stained in 0.005% EtBr for 20 min. Finally, the gel was scanned on the BIO-RAD UV generator to capture the image.
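The in silico screening step can be illustrated with a minimal Python sketch. It is not the online PCR tool used in this study: it assumes exact primer matches on the forward strand only, and the function names, the toy template, and the toy primers are hypothetical. The selection rule is implemented under one reading of the criterion, namely that a marker is kept when the largest difference in predicted product length among the species exceeds ten nucleotides. The product lengths in the usage lines are those reported in the Results for Pvm30 and Pvm31.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Predicted amplicon length, or None if a primer does not match.

    Simplification: exact matches only, forward strand only.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


def is_candidate(lengths: dict, min_gap: int = 10) -> bool:
    """Keep a marker if predicted lengths span more than min_gap nucleotides."""
    values = [v for v in lengths.values() if v is not None]
    if len(values) < 2:
        return False
    return max(values) - min(values) > min_gap


# Toy template and primers (hypothetical, for illustration only).
toy_fwd = "GATTACAGG"
toy_rev = "CCTTGAGTC"
toy_template = "TTT" + toy_fwd + "ACGT" * 3 + reverse_complement(toy_rev) + "GGG"
toy_length = in_silico_pcr(toy_template, toy_fwd, toy_rev)  # 9 + 12 + 9 nt

# Online PCR product lengths (bp) as reported in the Results.
pvm30 = {"vietnamensis": 250, "langbianensis": 250, "stipuleanatus": 232,
         "ginseng": 268, "fuscidiscus": 304}
pvm31 = {"vietnamensis": 176, "langbianensis": 176, "stipuleanatus": 153,
         "ginseng": 156, "fuscidiscus": 156}
print(toy_length, is_candidate(pvm30), is_candidate(pvm31))
```

Under this reading, both Pvm30 and Pvm31 pass the filter, although neither separates every pair of taxa on its own.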

Development of the dCAPS marker
The dCAPS marker was used to distinguish P. vietnamensis var. langbianensis and P. vietnamensis var. vietnamensis. The matK genes of these two ginsengs, which were used to develop the dCAPS marker, were amplified according to Komatsu et al. (2001) and sequenced by Humanizing Genomic Macrogen (Korea); their GenBank accession numbers were MW771280 and MW771281, respectively. The dCAPS marker was designed at an SNP position in the coding sequence of the matK genes. For the dCAPS primer pair, the online software dCAPS Finder 2.0 (http://helix.wustl.edu/dcaps/dcaps.html) (Neff et al. 2002) was used to create a restriction enzyme site at the SNP position and to design the first primer. The remaining primer was designed using Primer3plus (https://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi). PCR for the dCAPS marker used the same reaction components and thermal cycle as for the STR markers. Restriction enzyme reactions were performed in a volume of 20 µl containing 5 µl of PCR product, 2 µl of 10X NEBuffer, 1 µl of restriction enzyme BglII, and 12 µl of ddH2O. The reaction mixtures were incubated at 37 °C for 90 min and then analyzed on a 2% agarose gel.
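When several samples are digested in parallel, the per-reaction volumes above can be scaled into a shared master mix. The following Python sketch is only a bench-side convenience and not part of the published protocol: the function name and the 10% pipetting overage are assumptions, and the PCR product is left out of the mix because it is added to each tube separately.

```python
# Per-reaction volumes (µl) as stated in the text; total is 20 µl.
PER_REACTION_UL = {
    "PCR product": 5.0,
    "10X NEBuffer": 2.0,
    "BglII": 1.0,
    "ddH2O": 12.0,
}


def master_mix(n_reactions: int, overage: float = 0.1) -> dict:
    """Volumes (µl) of the shared components for n reactions.

    overage is an assumed pipetting allowance (10% by default).
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "PCR product"
    }


total_volume = sum(PER_REACTION_UL.values())
# Final buffer concentration: 2 µl of 10X stock in 20 µl gives 1X.
buffer_final_x = 10 * PER_REACTION_UL["10X NEBuffer"] / total_volume
print(total_volume, buffer_final_x, master_mix(10))
```

Each tube then receives 15 µl of the mix plus 5 µl of its own PCR product.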

Analysis of tandem repeat in the P. vietnamensis var. vietnamensis chloroplast genome
Although chloroplast genomes of P. vietnamensis have been analyzed in previous studies, their tandem repeat sequences have received little attention. We analyzed the P. vietnamensis var. vietnamensis chloroplast genome and identified 24 loci containing tandem repeat units (Table 1). The repeating sequence unit (RSU) ranged from 9 to 25 nucleotides in length, and the repeat number ranged from 1.9 to 4.6. Five loci were located in coding regions and the remaining loci were in non-coding regions. The percentage of matches between adjacent copies ranged from 79 to 100, while the percentage of indels varied from 0 to 12 (Table 1). This suggests that mutations have accumulated at these loci during the evolution of Panax species and that these regions could be exploited in ginseng authentication research.
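Fractional repeat numbers such as 1.9 or 4.6 arise when the last copy of a repeat unit is incomplete. The Python sketch below shows one simple way to obtain such a value for an exact repeat; it is not the Tandem Repeats Finder algorithm, which uses a probabilistic, alignment-based approach that tolerates mismatches and indels. The function name and the toy sequence are hypothetical.

```python
def copy_number(seq: str, start: int, unit: str) -> float:
    """Copy number of an exact tandem repeat of unit beginning at start.

    Counts matched bases (including a partial final copy) and divides
    by the unit length. Mismatches and indels are not tolerated.
    """
    period = len(unit)
    matched = 0
    while (start + matched < len(seq)
           and seq[start + matched] == unit[matched % period]):
        matched += 1
    return matched / period


# Toy locus: two full copies of a 5-nt unit plus a partial copy of 3 nt.
toy_unit = "ACGTT"
toy_seq = "GG" + toy_unit * 2 + "ACG" + "CC"
toy_copies = copy_number(toy_seq, 2, toy_unit)  # 13 matched bases / 5
print(toy_copies)
```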

Identification of STR markers for Panax species authentication in Vietnam
Pvm30 was designed to amplify a region containing the RSU CGA TAT TGA TGC TAG TGA (18 bp), repeated three times from nucleotide (nu) 92851 to nu 92910 of the chloroplast genome of P. vietnamensis var. vietnamensis. This sequence lies in the coding region of the ycf2 gene (Table 1). Online PCR of Pvm30 with the chloroplast genomes of Panax species showed that P. vietnamensis var. vietnamensis and P. vietnamensis var. langbianensis gave a PCR product of 250 bp containing the RSU repeated 3 times. In Panax stipuleanatus the RSU was repeated twice, so the PCR product length was 232 bp. The RSU was repeated 4 times in P. ginseng and 6 times in P. vietnamensis var. fuscidiscus, so their PCR product lengths were 268 and 304 bp, respectively. The sequences of Panax species obtained from online PCR with Pvm30 are shown in supplementary file S1.
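The Pvm30 product lengths follow directly from the number of RSU copies, which can be verified with a short Python sketch. It assumes that the taxa differ only by whole copies of the 18 bp RSU, with the 250 bp product (3 copies) as the reference; the function name is hypothetical.

```python
# RSU of Pvm30 as given in the text, with spaces removed.
RSU_PVM30 = "CGATATTGATGCTAGTGA"


def expected_length(copies: int,
                    ref_length: int = 250,
                    ref_copies: int = 3,
                    unit_length: int = len(RSU_PVM30)) -> int:
    """Predicted product length (bp) for a given number of RSU copies.

    Assumes length differences come only from whole RSU copies.
    """
    return ref_length + (copies - ref_copies) * unit_length


for taxon, copies in [("P. stipuleanatus", 2),
                      ("P. vietnamensis var. vietnamensis", 3),
                      ("P. ginseng", 4),
                      ("P. vietnamensis var. fuscidiscus", 6)]:
    print(taxon, copies, expected_length(copies))
```

The predicted values (232, 250, 268, and 304 bp) agree with the online PCR results reported above.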
Pvm31 amplified a non-coding region containing the RSU GAC ATT GAG TTC ATA ACA TA (20 bp), repeated twice from nu 64957 to nu 64996 of the chloroplast genome of P. vietnamensis var. vietnamensis, with an online PCR product length of 176 bp (Tables 1 and 2). Panax vietnamensis var. langbianensis gave the same result as P. vietnamensis var. vietnamensis. In the remaining species, the RSU was not repeated. Therefore, P. vietnamensis var. fuscidiscus and P. ginseng had a product length of 156 bp. Panax stipuleanatus was missing 3 nucleotides in the online PCR product sequence, so its product length was 153 bp. The sequences of Panax species obtained from online PCR with Pvm31 are shown in supplementary file S2.
Electrophoresis showed that the band sizes of the Panax species matched the online PCR results for Pvm30 and Pvm31 (Fig. 1). Thus, Pvm30 and Pvm31 could identify most ginseng species in Vietnam, such as P. vietnamensis var. fuscidiscus and P. stipuleanatus.
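The way the two STR markers narrow down the identity of a sample can be sketched as a lookup that intersects the candidates from each marker. The expected sizes are the online PCR product lengths reported above; the function names and the 5 bp tolerance for reading band sizes from an agarose gel are assumptions made for illustration.

```python
# Expected product lengths (bp) from the online PCR results.
EXPECTED_BP = {
    "Pvm30": {
        "P. vietnamensis var. vietnamensis": 250,
        "P. vietnamensis var. langbianensis": 250,
        "P. stipuleanatus": 232,
        "P. ginseng": 268,
        "P. vietnamensis var. fuscidiscus": 304,
    },
    "Pvm31": {
        "P. vietnamensis var. vietnamensis": 176,
        "P. vietnamensis var. langbianensis": 176,
        "P. stipuleanatus": 153,
        "P. ginseng": 156,
        "P. vietnamensis var. fuscidiscus": 156,
    },
}


def candidates(marker: str, observed_bp: int, tolerance: int = 5) -> set:
    """Taxa whose expected size is within tolerance of the observed band."""
    return {taxon for taxon, bp in EXPECTED_BP[marker].items()
            if abs(bp - observed_bp) <= tolerance}


def identify(pvm30_bp: int, pvm31_bp: int, tolerance: int = 5) -> set:
    """Taxa consistent with the bands observed for both STR markers."""
    return (candidates("Pvm30", pvm30_bp, tolerance)
            & candidates("Pvm31", pvm31_bp, tolerance))


print(identify(304, 156))  # a single taxon remains
print(identify(250, 176))  # two varieties remain unresolved
```

A 250 bp and 176 bp pattern leaves both P. vietnamensis var. vietnamensis and var. langbianensis as candidates, which is why a further marker is required for these two varieties.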

Identification of dCAPS marker to complete the protocol for P. vietnamensis var. vietnamensis authentication
Because the chloroplast genomes of P. vietnamensis var. vietnamensis and P. vietnamensis var. langbianensis are highly similar (99.96%), the STR markers could not separate these Panax varieties. Therefore, a dCAPS marker was used to distinguish these two ginsengs. Alignment of their matK coding sequences showed only one SNP, at position 248 (supplementary file S3). The PvmdCAPS primers, a dCAPS marker designed in this study, contained a cleavage site of the restriction enzyme BglII (AGATCT), in which "G" was the SNP allele of P. vietnamensis var. vietnamensis and "C" was a nucleotide changed from the original "T" to create the BglII cleavage site (Table 3). The PCR product size for both varieties was 187 bp. After the restriction enzyme reaction, P. vietnamensis var. vietnamensis showed a 161 bp band on the agarose gel (Fig. 2a). Five random samples each of P. vietnamensis var. vietnamensis and var. langbianensis were collected to evaluate the effectiveness of the dCAPS marker. PCR amplification was successful for all samples. After digestion with BglII, 5 samples, in lanes 3, 4, 5, 7, and 8, were identified as P. vietnamensis var. vietnamensis. The remaining ginseng samples were P. vietnamensis var. langbianensis (Fig. 2b).
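The fragment sizes expected from the BglII digestion can be checked with a short Python sketch. BglII recognizes AGATCT and cuts after the first A. The amplicon below is a toy 187 bp sequence, not the real matK amplicon: the filler bases, the position of the site, and the uncut allele (AAATCT) are hypothetical, chosen only so that cutting reproduces the 161 bp band reported above. The accompanying 26 bp fragment follows from 187 - 161 and is an inference, as is the expectation that an allele lacking the site stays at 187 bp.

```python
def digest(seq: str, site: str = "AGATCT", cut_offset: int = 1) -> list:
    """Fragment lengths after cutting at every occurrence of site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for BglII, which cuts A^GATCT). The site is palindromic,
    so searching one strand is sufficient.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy 187 bp amplicons (hypothetical filler; only the lengths matter).
left = ("ACGT" * 7)[:25]     # 25 bp upstream of the site
right = "ACGT" * 39          # 156 bp downstream of the site
cut_allele = left + "AGATCT" + right    # carries the BglII site
uncut_allele = left + "AAATCT" + right  # hypothetical allele, no site

print(len(cut_allele), digest(cut_allele), digest(uncut_allele))
```

On a 2% agarose gel, the 161 bp band would be the practical readout; a fragment as small as 26 bp is unlikely to be clearly visible.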

Discussion
In Vietnam, Panax vietnamensis var. vietnamensis (Ngoc Linh ginseng) has high economic value on the market, but a method to distinguish it from other Panax species is still lacking. Thus, the identification of P. vietnamensis var. vietnamensis is important to protect the brand and to control production activities related to this ginseng variety (Ho and Pham 2020). Molecular markers for distinguishing Panax species have been developed. Panax ginseng and P. quinquefolius could be differentiated from each other by polymorphic markers such as RAPD, AFLP, PCR-AFLP, and SSR (Shaw and But 1995; Ngan et al. 1999; Um et al. 2001; Ha et al. 2002; Jung et al. 2014). However, these techniques have not been developed to differentiate many types of ginseng and were mainly used to assess genetic diversity (Jo et al. 2017). Although EST-SSR markers were used to differentiate P. ginseng cultivars, the development and selection process was time-consuming and required genomic data to design promising markers (Cheng-Jun et al. 2008; Kim et al. 2012; Jo et al. 2017). Notably, a rapid method for the authentication of three ginsengs, P. ginseng, P. quinquefolius, and P. notoginseng, was developed based on STR markers (Qin et al. 2005). STR markers had also been suggested as a potential technique for the identification of Panax species (Hon et al. 2003). In addition, dCAPS markers were designed from SNPs in the coding sequences of the chloroplast genomes of 7 Panax species, which allowed these ginsengs to be distinguished from each other (Nguyen et al. 2018).
In this study, we proposed a method to distinguish Panax species and varieties in Vietnam by combining STR and dCAPS markers. Two STR markers, Pvm30 and Pvm31, were effective in species identification. Pvm31 divided the 5 Panax taxa into 2 groups: the first group included P. vietnamensis var. vietnamensis and var. langbianensis, and the remaining species belonged to the second group. In addition, Pvm30 separated P. stipuleanatus, P. vietnamensis var. fuscidiscus, and P. ginseng based on differences in PCR product length. Although the two P. vietnamensis varieties in the first group could not be distinguished from each other, these markers were still valuable for identifying the remaining ginseng species.
Comparison of the chloroplast genome sequences of P. vietnamensis var. vietnamensis (KP036470) and var. langbianensis (MT798583) showed a similarity of 99.96%, which explains why the STR markers could not differentiate these two ginsengs. Therefore, we developed a dCAPS marker to distinguish these two varieties. Alignment of the matK coding sequences showed an SNP at position 248. The PvmdCAPS marker was designed at this position and differentiated the two P. vietnamensis varieties. This is also the first time that a dCAPS marker has been used to distinguish ginseng varieties.

Author contributions
The authors confirm contributions to the article as follows: study conception and design: DXT and NML; data collection: DXT and MXC; analysis and interpretation of results: MXC and DXT; draft manuscript preparation: NML and MXC. All authors reviewed the results and approved the final version of the manuscript.
Funding None.
Data availability All data are available from the corresponding author upon reasonable request.

Declarations
Conflict of interest All authors declare that they have no conflicts of interest.