Comparative plastomes and phylogenetic analysis of seven Korean endemic Saussurea (Asteraceae)

DOI: https://doi.org/10.21203/rs.3.rs-1471846/v1

Abstract

Background: Saussurea is one of the most species-rich genera in the Cardueae, Asteraceae. There are approximately 40 Saussurea species distributed in Korea, with nearly 40% of them endemics. Infrageneric relationships remain uncertain due to insufficient resolutions and low statistical support. In this study, we sequenced the plastid genomes of five Korean endemic Saussurea (S. albifolia, S. calcicola, S. diamantica, S. grandicapitula, and S. seoulensis), and comparative analyses including two other endemics (S. chabyoungsanica and S. polylepis) were conducted.  

Results: The plastomes of Korean endemics were highly conserved in gene content, order, and numbers. Exceptionally, S. diamantica had mitochondrial DNA sequences including two tRNAs in SSC region. There were no significant differences of the type and numbers of SSRs among the seven Korean endemics except in S. seoulensis. Nine mutation hotspots with high nucleotide diversity value (Pi > 0.0033) were identified, and phylogenetic analysis suggested that those Korean endemic species most likely evolved several times from diverse lineages within the genus. Moreover, molecular dating estimated that the Korean endemic species diverged since the late Miocene.

Conclusions: This study provides insight into understanding the plastome evolution and evolutionary relationships of highly complex species of Saussurea in Korean peninsula.

Background

Saussurea DC. (ca. 400 species) is one of the most abundant genera of the tribe Cardueae (Asteraceae) and is adapted to cool temperate and arctic regions of Asia, Europe, and North America [1, 2]. Lipschitz (1979) classified Saussurea into six subgenera according to morphological characteristics: Amphilaena (Stschegl.) Lipsch., Eriocoryne (DC.) Hook. f., Frolovia (DC.) Lipsch, Jurinocera (Baill.) Lipsch., Saussurea DC., and Theodorea (Cass.) Lipsch. However, several phylogenetic studies based on morphological traits and molecular markers have provided evidence to designate subg. Jurinocera, subg. Frolovia, subg. Saussurea sect. Elatae, subg. Saussurea sect. Aucklandia, and subg. Saussurea sect Jacea as new genera [36]. Subsequently, Saussurea has recently been recognized as four subgenera (Amphilaena, Eriocoryne, Saussurea, and Theodorea) [6]. Although several phylogenetic studies have been conducted using nuclear and plastid loci as molecular markers [35, 7], relationships within Saussurea were poorly resolved due to rapid adaptive radiation and convergent evolution [5, 7, 8].

Recently, the advent of next generation sequencing technologies has led to rapidly accumulation of genomic data. Because of conserved structure, non-recombinant traits, and greater variability than the mitochondrion genome, plastid genome (plastome) regions have consistently been used as a robust tool in phylogenetic studies [911]. Furthermore, studies using complete plastomes have offered new insights into phylogenetic relationships and the diversification histories of species [1214]. Although studies on phylogenetic relationships, origins, and evolution using plastomes of Saussurea have been reported [15, 16], the limited number of Korean species was used in the previous studies, and there is a need for a better understanding the relationship among Korean species.

Saussurea is one of the rich species groups in the flora of Korea. The approximately 40 Saussurea species distributed in Korea comprise 2 subgenera, Theodorea and Saussurea, and nearly 40% are endemic species belonging to subg. Saussurea: S. calcicola Nakai, S. chabyoungsanica Im, S. chinnampoensis H. Lév. & Vaniot, S. conandrifolia Nakai, S. diamantica Nakai, S. eriophylla Nakai, S. grandicapitula W. Lee et H. T. Im, S. koidzumiana Kitam., S. macrolepis (Nakai) Kitam., S. myokoensis Kitam., S. polylepis Nakai, S. rorinsanensis Nakai, S. seoulensis Nakai, and S. uchiyamana Nakai [17]. Obtaining a rigorous phylogenetic framework for Saussurea species in Korea has been exceptionally challenging due to sampling difficulties and insufficient levels of resolutions and degree of statistical support. In particular, the phylogenetic tree based on chloroplast (cp) DNA markers (e.g., trnL–trnF and trnH–psbA) commonly used as barcodes has not yet been clearly resolved, showing multiple polytomies (Yun and Kim, unpublished data).

In this study, plastomes of five Korean endemic species, S. albifolia, S. calcicola, S. diamantica, S. grandicapitula, and S. seoulensis, were sequenced and comparative analyses were conducted, including two previously reported species S. chabyoungsanica [18] and S. polylepis [19]. Saussurea albifolia, which is described recently as a new species, has cordate or deltoid-cordate leaves with white or yellowish hair on abaxial surface. In addition, the campanulate involucre has brown-cobwebby hair and the tips of phyllaries do not recurve [20]. Saussurea calcicola has large leaves with cobwebby hair on the underside, and wings on the petiole. It grows to approximately a height of 1 m in limestone regions. Saussurea grandicapitula has cobwebby hairs on the petioles of the radical and lower cauline leaves, big globose involucres with brown-cobwebby hairs, and recurved phyllaries [21]. Saussurea albifolia, S. diamantica, and S. seoulensis share common traits including basal rosette leaves, cobwebby hair on abaxial leaf surfaces, and white or yellowish hairs on the involucre. However, S. diamantica has recurved involucres, and S. albifolia has larger involucral width than S. diamantica and yellowish cobwebby hair on the abaxial surface of leaf. Saussurea seoulensis has a distinctive bell-shaped involucre, the largest such structure among the species [17]. The most notable differences between S. chabyoungsanica and other Korean Saussurea species are long lanceolate leaves with short petioles and a compact corymb [22]. Saussurea polylepis is distinguishable by its glossy and reniform leaves. Based on several diagnostic morphlogical features of Saussurea, congeneric species are distinguished, but variable molecular markers through comparison of plastomes of endemic species can overcome low resolution shown in the previous phylogenetic study (Yun and Kim, unpublished data). In addition, identifying the structure and characteristics of plastomes of the Korean endemic Saussurea species will provide insights into understanding the plastome evolution of Saussurea.

The aims of this study were (1) to determine five complete plastomes of Korean endemic species, (2) to identify divergent sequence hotspots for development of informative cpDNA markers, (3) to gain insight into the evolution of Saussurea plastomes, including structural differences and molecular evolutionary patterns, and (4) to reconstruct phylogenetic relationships among the Korean endemic Saussurea species and estimate their divergence times using plastomes.

Results

Characteristics of plastomes

Newly sequenced plastomes of the five species had a total length of 152,435 (S. albifolia) – 173,114 bp (S. diamantica) (Fig. 1 and Table 1). Because some parts of mitochondrial DNA sequences including two tRNAs were inserted in the small single copy (SSC) region (ndhF-pseudo ycf1) of plastome of S. diamantica, the total length of S. diamantica was longer than that of other Saussurea species by 20,550 bp (Fig. 1b and Table 1). BLAST searches were performed to determine the characteristics of the insertion. The result demonstrated that the inserted sequences were highly matched to Chrysanthemum, Diplostephium, Lactuca, Helianthus, and Paraprenanthes mitochondrial DNA sequences, ranging from 5,159 bp (Paraprenanthes diversifolia, MN661146) to 7,090 bp (Diplostephium hartwegii, KX063855), but this does not mean the continuous consistency of the whole 20,550 bp on the mitochondrial genome. The transfer of mitochondrial DNA sequences to plastoms has been reported in the families Apiaceae, Apocynaceae, and Poaceae [2325]. The previous studies indicated that genes of less than 3 kb of mitochondrial DNA are inserted into the IR or LSC regions. Given that a plastome is highly conserved, the large insertion of mitochondrial DNA sequences is an unusual event. Thus, confirmation is needed that the insertion was not merely a product of assembling error. By comparing sequencing depth of before and after the insertion, Ma et al. [25] confirmed that it is not a product of misassembly. Because the plastome occupies the smallest portion of the genomic DNA, it can be easily distinguished from mitochondrial and nuclear DNA sequences by sequencing depth. Ma et al. [25] also inferred that the nuclear and mitochondrial genomes are larger than the plastome and would have a lower sequencing depth. The sequencing coverage of S. diamantica was evaluated instead of the sequencing depth suggested by Ma et al. [25]. The average sequencing coverage was 322.3 and that of the inserted regions was 356.8. The average sequencing coverage of the surrounding regions, which was 349.1 and 346.9 before and after the insertion with 200 bp, is similar to that of the inserted region. These results indicated that misassembly is not a cause of mitochondrial DNA insertion into the plastome of S. diamantica. In addition, PCR amplification and Sanger sequencing were conducted to confirm the insertion using designed two primer sets (SD1f: GTAGGGGGTGGGCGTATTTC, SD1r: GATGTCGAGTGCCGCTTTTC and SD2f: AGGGTGATGCTTGGCTTCT, SD2r: TTTTCGTGGTTAGAGCGGCT), and amplifications were successful but not for other species (data not shown). It also supported that there was no error in the assembly process.

Seven Saussurea species have a typical quadripartite structure that comprised a pair of inverted repeats (IR: IRA and IRB) of 25,185–25,893 bp, which were separated by SSC of 18,671–37,846 bp and large single copy (LSC) of 83,387–83,482 bp. The largest insertion (62 bp) was identified in the trnE–rpoB intergenic region of LSC of S. grandicapitula, S. polylepis, and S. seoulensis. Other than S. diamantica, length of IR, SSC, and LSC of each species was similar. The overall GC content was 37.7% and the GC content in the IR, SSC, and LSC was respectively 43.1%, 31.4%, and 35.8%, except for S. diamantica (overall GC: 38.6%, IR: 42.8%, SSC: 39.1%, LSC: 35.8%) (Table 1).

Seven Saussurea plastomes possessed the two inversions in LSC region like other Asteraceae. The seven plastomes depicted high similarity at the LSC, IR, and SSC boundaries (Fig. S1). Rps19 was across the LSC–IRB boundary without any change in sequence length, and trnH–GUG was located three base pairs away from the LSC–IRA boundary in all species. The ycf1 gene crossed SSC–IRB, with 4,022–4,740 bp within the SSC region and 561–1,240 bp within the IRB region. Ycf1 was a pseudogene, located at the SSC–IRA boundary, with 6–752 bp within the SSC region and with 561–1,240 bp within the IRA. In particular, S. diamantica had longer ycf1 (pseudogene) than others.

The seven plastomes contained 114 identical genes, including 80 protein-coding genes, 30 tRNA, and four rRNA genes. Eighteen genes including 12 protein–coding genes (atpF, clpP, ndhA, ndhB, petB, petD, rpl2, rpl16, rpoC1, rps12, rps16, and ycf3) and 6 tRNA genes (trnA–UGC, trnG–UCC, trnI–GAU, trnK–UUU, trnL–UAA, and trnV–UAC) had intron, and 17 genes (ndhB, rrn4.5, rrn5, rrn16, rrn23, trnA–UGC, trnI–CAU, trnI–GAU, trnL–CAA, trnN–GUU, trnR–ACG, trnV–GAC, rpl2, rpl23, rps7, ycf2, and ycf15) were duplicated in IR regions (Table S1). However, S. diamantica had two additional tRNA genes (trnC–GCA and trnM–CAU) from the mitochondrial genome in the SSC region. The formation of tertiary structure of two tRNAs was confirmed by simulation through the tRNAscan-SE 2.0.

 

 
 
Table 1

Summary of seven Saussurea plastid genomes

 

S. albifolia

S. calcicola

S. chabyoungsanica

S. diamantica

S. grandicapitula

S. polylepis

S. seoulensis

Reference

This study

This study

Cheon et al. [18]

This study

This study

Yun et al. [19]

This study

Total length (bp)

152,435

152,453

152,446

173,114

152,484

152,488

152,578

LSC length (bp)

83,387

83,399

83,397

83,482

83,431

83,417

83,441

SSC length (bp)

18,678

18,672

18,679

37,846

18,671

18,689

18,727

IRa length (bp)

25,185

25,191

25,185

25,893

25,191

25,191

25,205

Total GC content (%)

37.7

37.7

37.7

38.6

37.7

37.7

37.7

GC content of LSC

35.8

35.8

35.8

35.8

35.8

35.8

35.8

GC content of SSC

31.4

31.4

31.4

39.1

31.4

31.4

31.4

GC content of IR

43.1

43.1

43.1

42.8

43.1

43.1

43.1

Protein-coding gene number

80

80

80

80

80

80

80

rRNA gene number

4

4

4

4

4

4

4

tRNA gene number

30

30

30

32*

30

30

30

Total gene number

114

114

114

116

114

114

114

Identification of variable regions

SNP patterns that can be divided into 2 groups were found in 42 regions (Table S2). Of them, 34 regions were in LSC, followed by SSC and IR. Based on S. involucrata as a reference, there were no large differences among the seven Korean endemics. The LSC and SSC regions were more divergent than IR regions, and the coding regions were more conserved than the non-coding regions (Fig. S2). However, coding regions such as rbcL in the LSC region, ycf1 in the SSC region, and ycf2 in the IR regions showed variability.

The nucleotide diversity (Pi) ranged from 0 to 0.0053 (Fig. 2). The IR region had a relatively low nucleotide diversity value, ranging from 0 to 0.00286. We detected nine divergence hotspots with Pi values over 0.0033. Among them, seven were located in the LSC region, and two were located in the SSC region. Other than ycf1, the variable regions were concentrated in intergenic spaces. The hotspot with the highest Pi was ycf4–cemA (0.0051), followed by seven intergenic regions (psbC–trnS, rbcL–accD–psaI, rpl32–ndhF, trnT–trnD, psbE–petL, rps4–trnT–trnL, and rpl16–trnQ–psbK) and one gene region (ycf1).

Simple sequence repeats (SSRs) analysis

Five categories (mononucleotide, dinucleotide, trinucleotide, pentanucleotide, and hexanucleotide) of SSRs were detected, and the types and numbers of SSRs were similar across the seven Saussurea (Fig. 3). The total number of SSRs was 72 in S. albifolia, 75 in S. calcicola, 74 in S. chabyoungsanica, 76 in S. diamantica, 77 in S. grandicapitula, 78 in S. polylepis, and 82 in S. seoulensis. The detected SSRs were mainly located in the LSC region (67.1–77%) and distributed in the IR and SSC regions ranging from 9.8–11.1% and from 12.2–22.4%, respectively. Twenty-three of the SSRs detected from the seven Saussurea were located in 15 genes (cemA, ndhB, petA, psaA, psbC, rbcL, rpoA, rpoB, rpoC1, rpoC2, rps15, rrn23, trnS–UGA, ycf1, and ycf2) with 3–10 repeat numbers (Table S3). The most abundant type was mononucleotides A/T and species–specific SSR was identified from S. seoulensis as hexanucleotides TACAAA/TTTGTA.

Synonymous and non-synonymous substitution rate analysis

The non-synonymous (Ka) to synonymous (Ks) substitution rate ratio (Ka/Ks) has been used to determine whether protein-coding genes are subjected to selective pressure. If Ka/Ks is greater than 1, it could indicate that it is under positive selection [26]. We calculated synonymous and nonsynonymous substitution rates between S. involucrata and Korean endemic Saussurea (Fig. S3). Approximately 90% of protein coding genes were below 1 in Ka/Ks values. In seven Korean species, the Ka/Ks value was close to zero at 12 protein coding genes (clpP, ndhB, ndhH, petD, psaC, psbA, psbB, psbC, psbD, rpoB, rps11, and rps15) while that of five protein coding genes (ndhI, psaJ, psbL, rpl33, and ycf2) were 50, indicating positive selection influenced the differentiation of Saussurea.

Codon usage analysis

We detected similar patterns in the frequency of codon usage of seven Korean endemics. The 80 annotated protein-coding genes are encoded by 22,739 codons in S. albifolia, 22,831 in S. calcicola and S. polylepis, 22,826 in S. chabyoungsanica, 22,821 in S. diamantica, and 22,835 in S. grandicapitula, and 22,834 in S. seoulensis (Table S4). Leucine was the most abundant amino acid (10.6%), whereas cysteine was the least (1.1%). The most used synonymous codon was ATT, encoding isoleucine, and the least used was TGC, encoding cysteine. Usage of the start codon methionine (ATG) and tryptophan (TGG) had no biases (relative synonymous codon usage, RSCU = 1). All preferred relative synonymous codons (RSCU > 1) ended with an A or a T, other than TTG (leucine) (Fig. 4). The tendency for codon preference was similar among species. Of 61 codons (except for stop codon), 14 (Ala–GCT, Arg–AGA, Asn–AAT, Asp–GAT, Gln–CAA, Gly–GGA, His–CAT, Leu–TTA, Lys–AAA, Pro–CCT, Ser–TCT, Thr–ACT, Tyr–TAT, and Val–GTA) were highly preferred (RSCU > 1.5).

Phylogenetic analysis and molecular age estimation

In this study, 32 plastomes were used to determine the phylogenetic relationships among Korean endemic Saussurea. As a result, the higher resolution phylogenetic tree showed that Saussurea based on the current sampling is not monophyletic (Fig. 5). Of the seven endemic species, S. diamantica diverged first. The morphologically similar S. albifolia and S. seoulensis did not form a sister relationship; S. albifolia formed a sister with the group including S. odontolepis, S. bullockii, S. tianshuiensis, and S. chabyoungsanica. Limestone endemic, S. calcicola, shared its common ancestor with the group consisting of S. brachycephala, S. amurensis, S. polylepis, S. grandicapitula, S. seoulensis. S. kuschakewiczii, S. leucophylla, S. tomentosa, S. komaroviana, and S. subtriangulata. However, relatively low bootstrap values hindered us to determine precisely their phylogenetic relationships. Also, S. chabyoungsanica, which is narrow limestone endemic to central Korea, is sister to S. tianshuiensis, which occurs narrowly in high montane meadows (1800–2500 m) in three provinces of northwestern China (i.e., SE Gansu, Shaanxi, and Ningxia).

The molecular age estimation suggested that endemic Korean Saussurea originated in the late Miocene (Tortonian), with the estimated crown age of approximately 9 million years ago (95% HPD, 3.03–18.8 million years ago, MYA) (Fig. 6). The clade containing all but S. diamantica, which has unusual mitochondrial DNA sequences insertion, was estimated to be 6.18 MYA (95% HPD, 2.14–13.28 MYA). Two major lineages of the Korean endemics, i.e., S. albifoliaS. chabyoungsanica and S. calcicolaS. seoulensisS. polylepisS. grandicapitula, appear to be speciated even more recently, during the Pleistocene.

 

Discussion

Characterization of the Korean endemic plastid genome

Accumulation of plastome data from various land plants has improved our understanding of plant evolution. In general, the plastome has a highly conserved structure; a single circular DNA molecule is composed of a large single copy, a small single copy, and two copies of inverted repeats. However, structural rearrangements, gene loss, IR expansion and contraction, inversion, and gene transfer occur in certain species or lineages [27]. In this study, we sequenced the complete plastomes of five Korean endemic Saussurea species and are compared their characteristics including two previous sequenced endemic species, S. chabyougnsanica and S. polylepis. Like other angiosperms, they have a quadripartite structure and are highly conserved in gene order, gene content, and gene number. The expansion and contraction of the IR regions have been affected by length variation in plastomes [2829]. In addition, smaller expansions and contractions of the IR regions are a common phenomenon in angiosperms [3031]. The IR lengths between the seven endemic species are slightly different due to insertions and deletions in ycf2 and intergenic spaces of ycf2-trnL-CAA, trnL-CAA-ndhB, ycf15-trnV-GAC, and trnR-ACG-trnN-GUU. In particular, insertion or deletion in ycf2 sequences is well reported in a previous study [32]. Many insertions and deletions can be shared by certain groups (e.g., Brassicales, Saxifragales). Moreover, multiple insertions and deletions that are highly conserved in ycf2 can be observed to be shared by more distantly related taxa [32]. The distribution of those insertions and deletions may relate to multiple evolutionary histories, such as differential gene losses and horizontal gene transfer. Concerning that there are differently shared patterns of insertions and deletions as well as SNPs among Korean endemic species, it can support those Korean endemics are originated from multiple lineages.

Interestingly, we found that S. diamantica has mitochondrial DNA sequences including two tRNAs in SSC region. DNA transfer between nuclear and organellar (plastid and mitochondria) genomes has been reported in several taxa. Most prominent transfer is from organellar genomes into the nuclear genomes [33, 34]. Also, there are previous studies reporting gene transfer between organelle genomes [23, 25, 35]. In particular, the transfer of mitochondrial DNA sequences to plastomes has been reported in the families Apiaceae, Apocynaceae, and Poaceae [2325]. However, the evidence of DNA transfer between organellar genomes have been not reported in Asteraceae. In this study, our result of BLAST search demonstrated that the inserted sequences were highly matched to Chrysanthemum, Diplostephium, Lactuca, Helianthus, and Paraprenanthes mitochondrial DNA sequences, but this does not mean the continuous consistency of the whole 20,550bp on the mitochondrial genome. Even though it is difficult to reveal whether the exact mechanism is the result of deletion after insertion of long DNA sequences or the result of multiple transfers. The insertion of mitochondrial DNA into the plastome is not common at the species level, so these conclusions require verification through the further studies. Information on the mitochondrial genome in Saussurea will improve understanding of the insertion event and genome evolution overall.

Nucleotide diversity and selection pressure

The Pi values can provide useful information for marker development for phylogenetic analysis and DNA barcoding. The effectiveness of newly developed markers based on nucleotide diversity values has been verified through the discrimination of closely related species [36, 37]. In seven Korean species, Pi ranged from 0 to 0.0053 (Fig. 2), indicating high similarity of sequences among seven Saussurea species. These low values were reported in Sonchus (Asteraceae) ranged from 0 to 0.006 [38] and Meconopsis (Papaveraceae) ranged from 0 to 0.007 [39]. The high similarity might be related to the recent speciation.

The hotspot with the highest Pi was ycf4–cemA (0.0051), followed by seven intergenic regions (psbC–trnS, rbcL–accD–psaI, rpl32–ndhF, trnT–trnD, psbE–petL, rps4–trnT–trnL, and rpl16–trnQ–psbK) and one gene region (ycf1). Of the nine variable regions detected in this study, rpl16–trnQ, trnT–psbD, rps4–trnT–trnL, accD–psaI, psbE–petL, and rpl32–ndhF coincide with the variable cp regions identified by Shaw et al. [41, 41] and trnT–psbD, rps4–trnT–trnL, accD–psaI, psbE–petL, and rpl32–ndhF regions have been utilized in conducting phylogenetic studies in many taxa [4246]. Previous studies used in cp molecular markers, such as trnL–trnF, psbA–trnH, and matK, have been poorly resolved in Saussurea [35]. The low resolution of phylogenetic studies can be interpreted as the low diversity values of trnL-trnF, psbA-trnH, and matK markers. Therefore, using these identified variable regions will be helpful for further clarifying phylogenetic relationships.

As microsatellites or SSR markers have hyper-mutation rates and polymorphism, they are suitable for population genetic analyses such as population genetic structures and gene flow patterns. In particular, Powell et al. [47] suggested that the SSR markers from the chloroplast are useful for acquiring insight into gene flow related to seed and pollen dispersal, genetic structure, nuclear-chloroplast interaction, and the origin of polyploidy. Many studies have reported the use of cpSSR markers with high polymorphisms. For example, Vendramin et al. [48] assessed the genetic variation among Abies alba populations using 2 cpSSRs, while Cubas et al. [49] evaluated the genetic variability and relationships among Ulex species using 6 cpSSRs. The detected SSRs were mainly located in the LSC region (67.1–77%), which is consistent with the characteristics found in other angiosperms [38, 50]. The most abundant type was mononucleotides A/T, which is consistent with the results obtained in previous studies [51, 52]. As the plastomes of seven Saussurea were conservative, SSR primers can be transferable across species and genera. Therefore, information involving SSRs in this study could be useful for studies at the population level and provide complementary data to the SSR markers of Saussurea identified from the nuclear genome [53].

If Ka/Ks is greater than 1, it could indicate that it is under positive selection [26]. Approximately 90% of protein coding genes were below 1 in Ka/Ks values. These results indicate that the protein-coding genes may have undergone purifying selection pressure during their evolution, and it is consistent with the typical tendency shown in the plastid genes [54].

Codon usage bias can also improve our understanding of the effects of natural selection during the evolution [55, 56]. If selective pressure or mutation preferences are absent, synonymous codons prefer equally, and the nucleotide mutations at each amino acid site occur randomly [57]. The tendency for codon preference was similar among species, indicating relatively conserved characteristics of plastome. There were more codons with the RSCU value less than one ended with base C or G and there was high A/T preference in the third codon. These are a common phenomenon in plastomes of vascular plants.

Phylogenetic relationships

Although Korean endemic Saussurea species can be distinguished by morphological characteristics, the low resolution of the previous phylogenetic study (Yun and Kim, unpublished data) was insufficient to understand the relationships between species. In the phylogenetic tree using complete plastomes, Korean endemics were not monophyletic. It is plausible that few independent lineages of Saussurea involved in the origin of several endemic species in Korea. In addition, we found that the morphologically similar S. albifolia and S. seoulensis did not form a sister relationship; S. albifolia formed a sister with the group including S. odontolepis, S. bullockii, S. tianshuiensis, and S. chabyoungsanica. This may suggest that the morphological similarities between S. albifolia and S. seoulensis could be due to convergent evolution or parallelism. Nevertheless, when the current Korean phylogenetic framework was compared to the previous broader phylogenomic study [16], our result also showed the same relationships between S. polylepis and S. amurensis, and S. chabyoungsanica and S. tianshuiensis. However, the clade including the Korean endemics has still low support values (bootstrap support value < 50). Although the complete plastome sequences were utilized to build baseline phylogenetic framework among the Korean endemics, recent explosive speciation of this group perhaps contributed to insufficient resolutions in species relationships.

As for the molecular age estimation based on much broader phylogenomic study [16], the MRCA of S. polylepis and S. amurensis was estimated to be 4.8 MYA, while the MRCA of S. chabyoungsanica and S. tianshuiensis was 0.32 MYA. Therefore, these age estimates are concordant with the current study. Like precise phylogenetic relationships among species of Saussurea in East Asia require further study, accurate molecular age estimation of Korean endemic lineages is needed. In addition, it is yet to be determined whether climatic oscillations during the Pleistocene were major evolutionary drivers in speciation of Saussurea [5860].

Conclusions

In this study, the complete plastomes of five Korean endemic Saussurea were reported and comparatively analyzed these data including previously reported species, S. chabyoungsanica and S. polylepis. The structures of the plastomes were generally conserved, sharing most genomic features despite the different morphological diversity, but we found the mitochondrial DNA sequences insertion in S. diamantica. Through the comparative analyses including variable regions, SSRs, Ka/Ks value, and codon usage, we identified the species-specific SSRs, different patterns of Ka/Ks among species, and nine hotspot regions. These resources will provide insight into the evolution of Korean Saussurea and plastome molecular markers for the identification among Saussurea species. The phylogenetic tree indicated Korean endemic species did not originate from one lineage.

Materials And Methods

Plant materials and DNA extraction

Fresh leaves of Saussurea albifolia, S. calcicola, S. dimantica, S. grandicapitula, and S. seoulensis were sampled from natural populations in South Korea and dried with silica gel. All voucher specimens were deposited in the Ha Eun Herbarium, at Sungkyunkwan University (SKK) (Table S5). Total genomic DNA was extracted using the DNeasy Plant Mini Kit (Qiagen, Carlsbad, California, USA) according to the manufacturer’s instructions.

DNA Sequencing, genome assembly, and annotation

After conducting quality control, qualified samples proceeded to library construction. Paired-end sequencing was performed on the Illumina Hiseq 4000 (Illumina Inc., San Diego, California, USA) at the Macrogen Corporation (Seoul, Korea). For each species, approximately 3.0 GB raw data were generated. The raw reads were assembled de novo into whole plastomes using Velvet v. 1.2.10 with multiple k-mers [61]. Dual Organellar GenoMe Annotator (DOGMA) software [62] and tRNAscan-SE [63] were used to annotate the protein coding genes and transfer RNA genes, respectively. The graphical maps of plastomes were drawn in the Organellar Genome DRAW (OGDRAW) program [64]. The five annotated complete plastome sequences were submitted to GenBank (Table S5).

Comparison of plastid genomes

The complete plastomes of the five new species and two previously reported Korean endemic S. chabyoungsanica (NC036677) and S. polylepis (NC036490) were aligned and adjusted manually using Geneious v.10.2.2. (Biomatters Ltd., Auckland, New Zealand). A large insertion (20,550 bp) found in S. diamantica was excluded for analysis. The complete plastomes of the seven Korean endemic Saussurea were compared using mVISTA [65] with Shuffle-LAGAN mode [66] and default parameters. The plastome of S. involucrata (NC029465) was used as a reference. Sliding window analysis was carried out to calculate the nucleotide diversity (Pi) using DnaSP v. 6 [67]. The step size was set to 300 bp, with a 900 bp window length. SSRs were detected using MISA [68]. The minimum repeat thresholds were set to ten for mononucleotide repeats, four for dinucleotide to tetranucleotide repeats, and three for pentanucleotide and hexanucleotide repeats. Sequences of 80 protein-coding regions without stop-codons were extracted from the plastomes of seven Saussurea and S. involucrata as a reference. KaKs_Calculator 2.0 [69] was used for calculating Ka/Ks values with genetic code 11 (bacterial and plant plastid code) and GY as a calculation mode. The codon usage frequencies and RSCU values for 80 protein-coding genes were determined with DnaSP v.6. The RSCU was divided into four models, including lack of preference (RSCU ≤ 1.0), low preference (1.0 < RSCU < 1.3), moderate preference (1.30 ≤ RSCU ≤ 1.50), and high preference (RSCU > 1.5) [57].

Phylogenetic analysis and estimation of divergence times

For the phylogenetic analysis, the plastomes of 28 Saussurea species including 7 Korean endemics and 4 representative species from 4 genera (Arctium, Carthamus, Centaurea, and Hemistepta) as an outgroup were used. Twenty-one additional Saussurea species were selected based on the previous study [16]. Sequences were aligned using MAFFT v.7.149 [70] and removed the gap or poorly aligned position using Gblocks v.0.91b, using default settings [71]. A maximum likelihood (ML) analysis was performed in IQ-TREE v. 1.4.2 [72] with 1000 replicates. By using jModelTest v.2.1.10 [73], GTR + I + G was selected as the optimal model based on the Bayesian information criterion. For the bayesian inference (BI) phylogenetic tree, the analysis was performed until the standard deviation of split frequencies was below 0.01 using MrBayes v3.1.2 [74]. Each chain was sampled every 100 generations. The first 25% of the sample was discarded as burn-in, and the rest was used to construct a consensus tree.

Divergence times of Korean endemic Saussurea species were estimated from the same dataset used for phylogenetic analysis using BEAST v2.6.2 [75] under lognormal relaxed clock. The GTR model was chosen to generate the tree. As a calibration point, the pairwise divergence time of 11.8 MYA for Carthamus and Centaurea was applied according to the data deposited in TIMETREE [76]. The Markov chain Monte Carlo chains were set to run for 16 million generations, sampling one every 2,500 generations. Tracer v.1.7 [77] was used for checking the convergence of the chains through adequate effective sample sizes (ESS). Finally, maximum clade credibility trees were calculated in TreeAnnotator v1.8.4 [78] and the summary trees with 95% highest posterior density (HPD) intervals of divergence time were visualized using Figtree v1.4.

Abbreviations

BI

Bayesian inference

cpDNA

Chloroplast DNA

DnaSP

DNA sequence polymorphism

ESS

Effective sample size

HPD

Highest posterior density

IR

Inverted repeat

Ka

Non–synonymous substitution rate

Ks

Synonymous substitution rate

LSC

Large single copy

ML

Maximum likelihood

MYA

Million years ago

Pi

Nucleotide diversity

Plastome

Plastid genome

RSCU

Relative synonymous codon usage

SSC

Small single copy

SSR

Simple sequence repeat

Declarations

Ethics approval and consent to participated

Not applicable.

Consent for publication

Not applicable.

Availability of data and materials

Sequence data that support the findings of this study have been deposited in GenBank (https://www.ncbi.nlm.nih.gov) with the accession codes of MT478053, MN509431, MT536932, MN530094, and MN530095. The sequences will be released after the article has been published.

Competing interests

The authors declare that they have no competing interests.

Funding

This project was supported in part by the Basic Science Research Program through the National Research Foundation of Korea (NRF) (NRF-2019R1A2C2009841).

Authors’ contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by SAY. The first draft of the manuscript was written by SAY 

and SCK revised it. All authors read and approved the final manuscript.

References

  1. Lipschitz S. Genus Saussurea DC. (Asteraceae). Nauka: Leningrad; 1979. p. 1–283.
  2. Butola JS, Samant SS. Saussurea species in Indian Himalayan Region: diversity, distribution and indigenous uses. Int J Plant Biol. 2010;1:e9. https://doi.org/10.4081/pb.2010.e9.
  3. Von Raab-Straube E. Phylogenetic relationships in Saussurea (Compositae, Cardueae) sensu lato, inferred from morphological, ITS and trnL-trnF sequence data, with a synopsis of Himalaiella gen. nov., Lipschitziella and Frolovia. Willdenowia. 2003;33:379–402.
  4. Kita Y, Fujikawa K, Ito M, Ohba H, Kato M. Molecular phylogenetic analyses and systematics of the genus Saussurea and related genera (Asteraceae, Cardueae). Taxon. 2004;53:679–90. https://doi.org/10.2307/4135443.
  5. Wang YJ, Liu JQ. Phylogenetic analyses of Saussurea sect. Pseudoeriocoryne (Asteraceae: Cardueae) based on chloroplast DNA trnL–F sequences. Biochem Syst Ecol. 2004;32:1009–23. https://doi.org/10.1016/j.bse.2004.04.005.
  6. Shi Z, Von Raab-Straube E. Cardueae. In: Wu ZY, Raven PH, Hong DY, editors. Flora of China. Beijing: Science Press & Missouri Botanical Garden Press; 2011. pp. 42–194.
  7. Wang YJ, Susanna A, Von Raab-Straube E, Milne R, Liu JQ. Island-like radiation of Saussurea (Asteraceae: Cardueae) triggered by uplifts of the Qinghai–Tibetan Plateau. Biol J Linn Soc. 2009;97:893–903. https://doi.org/10.1111/j.1095-8312.2009.01225.x.
  8. Wen J, Zhang J, Nie ZL, Zhong Y, Sun H. Evolutionary diversifications of plants on the Qinghai-Tibetan Plateau. Front Genet. 2014;5:4. https://doi.org/10.3389/fgene.2014.00004.
  9. Palmer JD, Jansen RK, Michaels HJ, Chase MW, Manhart JR. Chloroplast DNA variation and plant phylogeny. Ann Missouri Bot Gard. 1988;75:1180–206. https://doi.org/10.2307/2399279.
  10. Shaw J, Lickey EB, Beck JT, Farmer SB, Liu W, Miller J, Siripun KC, Winder CT, Schilling EE, Small RL. The tortoise and the hare II: Relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. Am J Bot. 2005;92:142–66. https://doi.org/10.3732/ajb.92.1.142.
  11. Shaw J, Lickey EB, Schilling EE, Small RL. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: The tortoise and the hare III. Am J Bot. 2007;94:275–88. https://doi.org/10.3732/ajb.94.3.275.
  12. Zhang SD, Jin JJ, Chen SY, Chase MW, Soltis DE, Li HT, Yang JB, Li DZ, Yi TS. Diversification of Rosaceae since the Late Cretaceous based on plastid phylogenomics. New Phytol. 2017;214:1355–67. https://doi.org/10.1111/nph.14461.
  13. Gitzendanner MA, Soltis PS, Wong GKS, Ruhfel BR, Soltis DE. Plastid phylogenomic analysis of green plants: a billion years of evolutionary history. Am J Bot. 2018;105:291–301. https://doi.org/10.1002/ajb2.1048.
  14. Chen HF. Chloroplast Phylogenomics Reveals the Intercontinental Biogeographic History of the Liquorice Genus (Leguminosae: Glycyrrhiza). Front Plant Sci. 2020;11:793. https://doi.org/10.3389/fpls.2020.00793.
  15. Zhang X, Deng T, Moore MJ, Ji Y, Lin N, Zhang H, Meng A, Wang H, Sun Y, Sun H. Plastome phylogenomics of Saussurea (Asteraceae: Cardueae). BMC Plant Biol. 2019;19:290. https://doi.org/10.1186/s12870-019-1896-6.
  16. Xu LS, Herrando-Moraira S, Susanna A, Galbany-Casals M, Chen YS. Phylogeny, origin and dispersal of Saussurea (Asteraceae) based on chloroplast genome data. Mol Phylogenet Evol. 2019;141:106613. https://doi.org/10.1016/j.ympev.2019.106613.
  17. Im HT. Saussurea DC. In: Park CW, editor. The genera of vascular plants of Korea. Seoul: Academy Publishing Company; 2007. pp. 982–9.
  18. Cheon KS, Kim HJ, Han JS, Kim KA, Yoo KO. The complete chloroplast genome sequence of Saussurea chabyoungsanica (Asteraceae), an endemic to Korea. Conserv Genet Resour. 2017;9:51–3. https://doi.org/10.1007/s12686-016-0617-9.
  19. Yun SA, Gil HY, Kim SC. The complete chloroplast genome sequence of Saussurea polylepis (Asteraceae), a vulnerable endemic species of Korea. Mitochondrial DNA Part B Resour. 2017;2:650–1. https://doi.org/10.1080/23802359.2017.1375881.
  20. Sun EM, Yun SA, Kim SC, Chung GY, Nam MJ, Im HT. Saussurea albifolia MJ Nam & HT Im (Compositae), a new species from the Baekdudaegan Area, Korea. J Species Res. 2021;10:159–63. https://doi.org/10.12651/JSR.2021.10.2.159.
  21. Lee WT, Im HT. Saussurea grandicapitula W. Lee et HT Im (Compositae), a new species from the Taebaek mountains, Korea. Korean J Pl Taxon. 2007;37:387–93.
  22. Im HT, Hong HH, Choi CI. Saussurea chabyoungsanica Im (Compositae), a new species from Mt. Chabyoung-san, Korea. J Plant Biol. 1997;40:288–90. https://doi.org/10.1007/BF03030462.
  23. Iorizzo M, Senalik D, Szklarczyk M, Grzebelus D, Spooner D, Simon P. De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome. BMC Plant Biol. 2012;12:61. https://doi.org/10.1186/1471-2229-12-61.
  24. Straub SCK, Cronn RC, Edwards C, Fishbein M, Liston A. Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae). Genome Biol Evol. 2013;5:1872–85. https://doi.org/10.1093/gbe/evt140.
  25. Ma PF, Zhang YX, Guo ZH, Li DZ. Evidence for horizontal transfer of mitochondrial DNA to the plastid genome in a bamboo genus. Sci Rep. 2015;5:11608. https://doi.org/10.1038/srep11608.
  26. Yang Z, Bielawski JP. Statistical methods for detecting molecular adaptation. Trends Ecol Evol. 2000;15:496–503. https://doi.org/10.1016/S0169-5347(00)01994-7.
  27. Daniell H, Lin CS, Yu M, Chang WJ. Chloroplast genomes: diversity, evolution, and applications in genetic engineering. Genome Biol. 2016;17:1–29. DOI 10.1186/s13059-016-1004-2.
  28. Cosner ME, Jansen RK, Palmer JD, Downie SR. The highly rearranged chloroplast genome of Trachelium caeruleum (Campanulaceae): multiple inversions, inverted repeat expansion and contraction, transposition, insertions/deletions, and several repeat families. Curr Genet. 1997;31:419–29. https://doi.org/10.1007/s002940050225.
  29. Plunkett GM, Downie SR. Expansion and contraction of the chloroplast inverted repeat in Apiaceae subfamily Apioideae. Syst Bot. 2000;25:648–67. https://doi.org/10.2307/2666726.
  30. Goulding SE, Olmstead RG, Morden CW, Wolfe KH. Ebb and flow of the chloroplast inverted repeat. Mol Gen Genet. 1996;252:195–206. https://doi.org/10.1007/BF02173220.
  31. Hansen DR, Dastidar SG, Cai Z, Penaflor C, Kuehl JV, Boore JL, Jansen RK. Phylogenetic and evolutionary implications of complete chloroplast genome sequences of four early-diverging angiosperms: Buxus (Buxaceae), Chloranthus (Chloranthaceae), Dioscorea (Dioscoreaceae), and Illicium (Schisandraceae). Mol Phylogenet Evol. 2007;45:547–63. https://doi.org/10.1016/j.ympev.2007.06.004.
  32. Huang JL, Sun GL, Zhang DM. Molecular evolution and phylogeny of the angiosperm ycf2 gene. J Syst Evol. 2010;48:240–8. https://doi.org/10.1111/j.1759-6831.2010.00080.x.
  33. Martin W, Stoebe B, Goremykin V, Hansmann S, Hasegawa M, Kowallik KV. Gene transfer to the nucleus and the evolution of chloroplasts. Nature. 1998;393:162–5. https://doi.org/10.1038/30234.
  34. Adams KL, Palmer JD. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol. 2003;29:380–95. https://doi.org/10.1016/S1055-7903(03)00194-5.
  35. Wang D, Wu YW, Shih ACC, Wu CS, Wang YN, Chaw SM. Transfer of chloroplast genomic DNA to mitochondrial genome occurred at least 300 MYA. Mol Biol Evol. 2007;24:2040–8. https://doi.org/10.1093/molbev/msm133.
  36. Park I, Yang S, Kim WJ, Song JH, Lee HS, Lee HO, Lee JH, Ahn SN, Moon BC. Sequencing and comparative analysis of the chloroplast genome of Angelica polymorpha and the development of a novel indel marker for species identification. Molecules. 2019;24:1038. https://doi.org/10.3390/molecules24061038.
  37. Shi H, Yang M, Mo C, Xie W, Liu C, Wu B, Ma X. (2019). Complete chloroplast genomes of two Siraitia Merrill species: Comparative analysis, positive selection and novel molecular marker development. PloS One. 2019;14:e0226865. https://doi.org/10.1371/journal.pone.0226865.
  38. Cho MS, Yang JY, Yang TJ, Kim SC. Evolutionary Comparison of the Chloroplast Genome in the Woody Sonchus Alliance (Asteraceae) on the Canary Islands. Genes. 2019;10:217. https://doi.org/10.3390/genes10030217.
  39. Li X, Tan W, Sun J, Du J, Zheng C, Tian X, Zeng M, Xiang B, Wang Y. Comparison of Four Complete Chloroplast Genomes of Medicinal and Ornamental Meconopsis Species: Genome Organization and Species Discrimination. Sci Rep. 2019;9:10567. https://doi.org/10.1038/s41598-019-47008-8.
  40. Shaw J, Lickey EB, Beck JT, Farmer SB, Liu W, Miller J, Siripun KC, Winder CT, Schilling EE, Small RL. The tortoise and the hare II: Relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. Am J Bot. 2005;92:142–66. https://doi.org/10.3732/ajb.92.1.142.
  41. Shaw J, Lickey EB, Schilling EE, Small RL. Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in angiosperms: The tortoise and the hare III. Am J Bot. 2007;94:275–88. https://doi.org/10.3732/ajb.94.3.275.
  42. López-Vinyallonga S, Mehregan I, Garcia-Jacas N, Tscherneva O, Susanna A, Kadereit JW. Phylogeny and evolution of the Arctium-Cousinia complex (Compositae, Cardueae‐Carduinae). Taxon. 2009;58:153–71. https://doi.org/10.1002/tax.581016.
  43. Demaio PH, Barfuss MHJ, Kiesling R, Till W, Chiapella JO. Molecular phylogeny of Gymnocalycium (Cactaceae): Assessment of alternative infrageneric systems, a new subgenus, and trends in the evolution of the genus. Am J Bot. 2011;98:1841–54. https://doi.org/10.3732/ajb.1100054.
  44. Javadi F, Tun YT, Kawase M, Guan K, Yamaguchi H. Molecular phylogeny of the subgenus Ceratotropis (genus Vigna, Leguminosae) reveals three eco-geographical groups and Late Pliocene-Pleistocene diversification: evidence from four plastid DNA region sequences. Ann Bot. 2011;108:367–80. https://doi.org/10.1093/aob/mcr141.
  45. Michelangeli FA, Guimaraes PJF, Penneys DS, Almeda F, Kriebel R. Phylogenetic relationships and distribution of new world Melastomeae (Melastomataceae). Bot J Linn Soc. 2013;171:38–60. https://doi.org/10.1111/j.1095-8339.2012.01295.x.
  46. Yazbek M, Oh SH. Peaches and almonds: phylogeny of Prunus subg. Amygdalus (Rosaceae) based on DNA sequences and morphology. Plant Syst Evol. 2013;299:1403–18. https://doi.org/10.1007/s00606-013-0802-1.
  47. Powell W, Machray GC, Provan J. Polymorphism revealed by simple sequence repeats. Trends Plant Sci. 1996;1:215–22. https://doi.org/10.1016/1360-1385(96)86898-1.
  48. Vendramin GG, Degen B, Petit RJ, Anzidei M, Madaghiele A, Ziegenhagen B. High level of variation at Abies alba chloroplast microsatellite loci in Europe. Mol Ecol. 1999;8:1117–26. https://doi.org/10.1046/j.1365-294x.1999.00666.x.
  49. Cubas P, Pardo C, Tahiri H. Genetic variation and relationships among Ulex (Fabaceae) species in southern Spain and northern Morocco assessed by chloroplast microsatellite (cpSSR) markers. Am J Bot. 2005;92:2031–43. https://doi.org/10.3732/ajb.92.12.2031.
  50. Wei F, Tang D, Wei K, Qin F, Li L, Lin Y, Zhu Y, Khan A, Kashif MH, Miao J. The complete chloroplast genome sequence of the medicinal plant Sophora tonkinensis. Sci Rep. 2020;10:1–13. https://doi.org/10.1038/s41598-020-69549-z.
  51. Cui Y, Zhou J, Chen X, Xu Z, Wang Y, Sun W, Song J, Yao H. Complete chloroplast genome and herbaria comparative analysis of three Lycium (Solanaceae) species with medicinal and edible properties. Gene Rep. 2019;17:100464. https://doi.org/10.1016/j.genrep.2019.100464.
  52. Gao K, Li J, Khan WU, Zhao T, Yang X, Yang X, Guo B, An X. Comparative genomic and phylogenetic analyses of Populus section Leuce using complete chloroplast genome sequences. Tree Genet Genomes. 2019;15:32. https://doi.org/10.1007/s11295-019-1342-9.
  53. Yun SA, Kim SC. Microsatellite markers for Saussurea polylepis (Asteraceae), a vulnerable continental island species endemic to Korea. Appl Plant Sci. 2019;7:e11270. https://doi.org/10.1002/aps3.11270.
  54. Guisinger MM, Kuehl JV, Boore JL, Jansen RK. Genome-wide analyses of Geraniaceae plastid DNA reveal unprecedented patterns of increased nucleotide substitutions. Proc Natl Acad Sci U S A. 2008;105:18424–29. https://doi.org/10.1073/pnas.0806759105.
  55. Xu C, Dong J, Tong C, Gong X, Wen Q, Zhuge Q. Analysis of synonymous codon usage patterns in seven different Citrus species. Evol Bioinform. 2013;9:215–28. https://doi.org/10.4137/EBO.S11930.
  56. Ingvarsson PK. Molecular evolution of synonymous codon usage in Populus. BMC Evol Biol. 2008;8:307. https://doi.org/10.1186/1471-2148-8-307.
  57. Yu X, Zuo L, Lu D, Lu B, Yang M, Wang J. Comparative analysis of chloroplast genomes of five Robinia species: Genome comparative and evolution analysis. Gene. 2019;689:141–51. https://doi.org/10.1016/j.gene.2018.12.023.
  58. Hewitt GM. Some genetic consequences of ice ages and their role in divergence and speciation. Biol J Linn Soc. 1996;58:247–76. https://doi.org/10.1111/j.1095-8312.1996.tb01434.x.
  59. Hewitt GM. The genetic legacy of the Quaternary ice ages. Nature. 2000;405:907–13. https://doi.org/10.1038/35016000.
  60. Schmitt T. Molecular biogeography of Europe: Pleistocene cycles and postglacial trends. Front Zool. 2007;4:11. https://doi.org/10.1186/1742-9994-4-11.
  61. Zerbino DR, Birney E. Velvet. Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9. https://doi.org/10.1101/gr.074492.107.
  62. Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20:3252–55. https://doi.org/10.1093/bioinformatics/bth352.
  63. Lowe TM, Chan PP. tRNAscan-SE on-line: Integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 2016;44:W54–7. https://doi.org/10.1093/nar/gkw413.
  64. Lohse M, Drechsel O, Bock R. Organellar genome DRAW(OGDRAW): A tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet. 2009;25:1451–2. https://doi.org/10.1007/s00294-007-0161-y.
  65. Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. VISTA: Computational tools for comparative genomics. Nucleic Acids Res. 2014;32:W273–9. https://doi.org/10.1093/nar/gkh458.
  66. Brudno M, Malde S, Poliakov A, Do CB, Couronne O, Dubchak I, Batzoglou S. Glocal alignment: Finding rearrangements during alignment. Bioinformatics. 2003;19:i54–62. https://doi.org/10.1093/bioinformatics/btg1005.
  67. Rozas J, Ferrer-Mata A, Sánchez-DelBarrio JC, Guirao-Rico S, Librado P, Ramos-Onsins SE, Sánchez-Gracia A. DnaSP v6: DNA sequence polymorphism analysis of large datasets. Mol Biol Evol. 2017;34:3299–302. https://doi.org/10.1093/molbev/msx248.
  68. Thiel T, Michalek W, Varshney RK, Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet. 2003;106:411–22. https://doi.org/10.1007/s00122-002-1031-0.
  69. Wang D, Zhang Y, Zhang Z, Zhu J, Yu J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteom Bioinf. 2010;8:77–80. https://doi.org/10.1016/S1672-0229(10)60008-3.
  70. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80. https://doi.org/10.1093/molbev/mst010.
  71. Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52. https://doi.org/10.1093/oxfordjournals.molbev.a026334.
  72. Nguyen LT, Schmidt HA, Von Haeseler A, Minh BQ. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–74. https://doi.org/10.1093/molbev/msu300.
  73. Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9:772. https://doi.org/10.1038/nmeth.2109.
  74. Ronquist F, Huelsenbeck JP. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19:1572–4. DOI:10.1093/bioinformatics/btg180.
  75. Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu CH, Xie D, Suchard MA, Rambaut A, Drummond AJ. BEAST 2: A software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2004;10:e1003537. https://doi.org/10.1371/journal.pcbi.1003537.
  76. Hedges SB, Dudley J, Kumar S. TimeTree: a public knowledge-base of divergence times among organisms. Bioinformatics. 2006;22:2971–2. https://doi.org/10.1093/bioinformatics/btl505.
  77. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst Biol. 2018;67:901–4. https://doi.org/10.1093/sysbio/syy032.
  78. Drummond A, Suchard MA, Xie D, Rambaut A. Bayesian Phylogenetics with Beauti and the Beast 1.7. Mol Biol Evol. 2012;29:1969–73. https://doi.org/10.1093/molbev/mss075.