Plastid genome data provide new insights into the phylogeny and evolution of the Subtribe Swertiinae

DOI: https://doi.org/10.21203/rs.3.rs-2403178/v1

Abstract

Background

Subtribe Swertiinae (Gentianaceae) is one of the most taxonomically difficult groups in the family. Its intergeneric and infrageneric classification and the phylogenetic relationships within it remain controversial and unresolved.

Methods

With the aim of clarifying the circumscription of taxa within the Subtribe Swertiinae, comparative and phylogenetic analyses were conducted using 34 Subtribe Swertiinae chloroplast genomes (4 newly sequenced) representing 9 genera.

Results

The 34 chloroplast genomes of Subtribe Swertiinae were relatively small, ranging in size from 149,036 to 154,365 bp. Each comprised two inverted repeat regions (25,069–26,126 bp) that separated a large single-copy region (80,432–84,153 bp) from a small single-copy region (17,887–18,476 bp), and all chloroplast genomes showed similar gene order, content, and structure. These chloroplast genomes contained 129–134 genes each, including 84–89 protein-coding genes, 30 tRNAs, and 4 rRNAs. Some genes, such as rpl33, rpl2 and ycf15, appeared to have been lost from some of these chloroplast genomes. Nineteen hypervariable regions, including trnC-GCA-petN, trnS-GCU-trnR-UCU, ndhC-trnV-UAC, trnC-GCA-petN, psbM-trnD-GUC, trnG-GCC-trnfM-CAU, trnS-GGA-rps4, ndhC-trnV-UAC, accD-psaI, psbH-petB, rpl36-infA, rps15-ycf1, ycf3, petD, ndhF, petL, rpl20, rpl15 and ycf1, were identified, and 36–63 SSRs per genome were detected as potential molecular markers. Positive selection analyses showed that two genes (ccsA and psbB) had high Ka/Ks ratios, indicating that these chloroplast genes may have undergone positive selection during their evolutionary history. Phylogenetic analysis showed that the 34 Subtribe Swertiinae species formed a monophyletic clade with two evident subclades, and that Swertia was paraphyletic with respect to other related genera, which were distributed in different clades.

Conclusion

These results provide valuable information to elucidate the phylogeny, divergence time and evolution process of Subtribe Swertiinae.

Introduction

Subtribe Swertiinae belongs to Gentianaceae and comprises approximately 539–565 species. It is widely distributed in alpine and temperate regions around the world but is rare in tropical and subtropical regions at low latitudes. East Asia and North America are the centers of diversification of this subtribe, with 137 species of 11 genera in China [1]. Many species of Subtribe Swertiinae, such as Halenia elliptica, Comastoma pedunculatum, Gentianopsis paludosa, Lomatogonium carinthiacum, Swertia mussotii and S. franchetiana, are source plants of the Tibetan medicine "Dida" (Zangyinchen). "Dida" is one of the most representative and commonly used medicinal materials in Tibetan medicine and has various effects, such as clearing the liver and gallbladder, diuresis, strengthening muscles and bones, and hemostasis. Clinically, it is widely used in the treatment of acute jaundice hepatitis, viral hepatitis, cholecystitis, urinary tract infection, blood disease, fall injury, dysentery, edema, influenza and other diseases. According to preliminary statistics, approximately 15% of Tibetan medicine prescriptions include "Dida", such as the 25 flavours of coral pill, the 25 flavours of Swertia pill, and the Ganlu ling pill. "Dida" is also the main or an accompanying ingredient in modern Tibetan medicine preparations, such as Zangyin Chen tablet (capsule), Gantaishu capsule, Zangjiangzhi capsule and fluan pill. Therefore, increasing attention has been given to the plants of Subtribe Swertiinae due to their extensive pharmacological effects.

However, the relationships within Subtribe Swertiinae remain poorly understood, especially between genera [2–5]. Struwe et al. (2002) [1] divided Subtribe Swertiinae into 14 genera based on morphological characters, a treatment accepted by later researchers [3, 6]. Subsequently, Ho and Liu (2015) [7] added two newly published genera, Lomatogoniopsis and Sinoswertia, to Subtribe Swertiinae. Subtribe Swertiinae therefore contains 16 genera, of which 13 are native to China, including three Chinese endemic genera. Several recent phylogenetic studies have tried but failed to resolve the relationships among the 16 genera in Subtribe Swertiinae [4, 5, 8]. Moreover, current taxonomic hypotheses about the relationships within and between genera of Subtribe Swertiinae rely on morphological characters and a few fragments of chloroplast DNA (cpDNA) sequences [4, 5]. Therefore, additional molecular markers are needed for phylogenetic analysis to resolve the interspecific relationships and evolutionary history of Subtribe Swertiinae.

Because the chloroplast genome is the second largest genome after the nuclear genome and the nucleotide substitution rates of chloroplasts are moderate, the chloroplast genome of plants has a significant advantage in phylogenetic studies both at higher taxonomic levels and among species [9–13]. In addition, comparative analysis of chloroplast genomes provides essential insights into the organization and evolutionary history of taxonomically related species [14–17]. Herein, we conducted comparative analyses of the chloroplast genomes of 34 selected Subtribe Swertiinae species representing 9 genera for which complete chloroplast sequences were available (Table 1). The study objectives were to (1) identify the structure and characteristics of chloroplast genomes among Subtribe Swertiinae; (2) explore the intergeneric and interspecific relationships of Subtribe Swertiinae; and (3) identify genes that are potentially under positive selection, negative selection or neutral evolution and that could be targeted for evolutionary studies in Gentianaceae.

Table 1

Complete chloroplast genome features of 34 species of 9 genera in Subtribe Swertiinae

| Species | Total length (bp) | Total GC (%) | LSC length (bp) | LSC GC (%) | SSC length (bp) | SSC GC (%) | IR length (bp) | IR GC (%) | GenBank accession number | Gene number | tRNA gene number | rRNA gene number | Protein-coding gene number |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Comastoma falcatum | 151,423 | 38.26 | 81,721 | 36.34 | 18,248 | 31.78 | 25,727 | 43.59 | MK331815 | 132 | 37 | 8 | 87 |
| Comastoma pulmonarium | 151,595 | 38.25 | 81,919 | 36.30 | 18,280 | 31.79 | 25,698 | 43.69 | MW324577 | 130 | 37 | 8 | 85 |
| Gentianopsis barbata | 151,123 | 37.85 | 82,690 | 35.80 | 17,887 | 31.77 | 25,273 | 43.34 | MZ579704 | 131 | 37 | 8 | 86 |
| Gentianopsis grandis | 151,271 | 37.87 | 82,572 | 35.81 | 17,907 | 31.76 | 25,396 | 43.27 | NC_049879 | 134 | 37 | 8 | 89 |
| Gentianopsis paludosa | 151,568 | 37.84 | 82,834 | 35.76 | 17,928 | 31.77 | 25,403 | 43.35 | MT921831 | 129 | 37 | 8 | 84 |
| Lomatogoniopsis alpina | 150,986 | 38.13 | 81,302 | 36.22 | 18,180 | 31.35 | 25,752 | 43.54 | NC_050658 | 131 | 37 | 8 | 86 |
| Lomatogonium perenne | 151,678 | 38.16 | 81,979 | 36.28 | 18,237 | 31.46 | 25,731 | 43.52 | NC_050659 | 131 | 37 | 8 | 86 |
| Pterygocalyx volubilis | 154,365 | 37.87 | 84,033 | 35.87 | 18,476 | 31.65 | 25,928 | 43.34 | NC_056992 | 131 | 37 | 8 | 86 |
| Veratrilla baillonii | 151,962 | 38.24 | 82,475 | 36.35 | 17,983 | 30.39 | 25,752 | 43.44 | MW872006 | 132 | 37 | 8 | 87 |
| Halenia coreana | 153,198 | 38.22 | 83,252 | 36.36 | 18,372 | 32.16 | 25,787 | 43.39 | MK606372 | 134 | 37 | 8 | 89 |
| Halenia elliptica | 153,305 | 38.15 | 82,767 | 36.26 | 18,286 | 32.02 | 26,126 | 43.29 | NC_050657 | 133 | 37 | 8 | 88 |
| Swertia bifolia | 153,242 | 38.06 | 83,496 | 36.16 | 18,200 | 31.89 | 25,773 | 43.33 | SUB11740174 | 133 | 37 | 8 | 88 |
| Swertia bimaculata | 153,751 | 38.03 | 84,156 | 36.02 | 18,089 | 32.07 | 25,753 | 43.39 | MW344296 | 134 | 37 | 8 | 89 |
| Swertia cincta | 149,089 | 38.20 | 80,481 | 36.34 | 17,946 | 31.79 | 25,331 | 43.42 | MZ261898 | 133 | 37 | 8 | 88 |
| Swertia cordata | 153,429 | 38.05 | 83,612 | 36.16 | 18,037 | 31.75 | 25,890 | 43.3 | NC_054359 | 133 | 37 | 8 | 88 |
| Swertia dichotoma | 152,977 | 37.50 | 83,044 | 35.55 | 18,303 | 31.25 | 25,815 | 43.02 | MZ261899.1 | 132 | 37 | 8 | 87 |
| Swertia dilatata | 150,057 | 38.17 | 81,310 | 36.28 | 17,887 | 31.79 | 25,430 | 43.42 | MW344298 | 132 | 37 | 8 | 87 |
| Swertia diluta | 153,691 | 38.10 | 83,859 | 36.20 | 18,300 | 31.9 | 25,766 | 43.5 | NC_057681.1 | 134 | 37 | 8 | 89 |
| Swertia erythrosticta | 153,039 | 38.10 | 83,372 | 36.18 | 18,249 | 31.89 | 25,709 | 43.33 | MW344299 | 133 | 37 | 8 | 88 |
| Swertia franchetiana | 153,428 | 38.20 | 83,564 | 34.66 | 18,342 | 33.22 | 25,749 | 43.28 | NC_056357 | 133 | 37 | 8 | 88 |
| Swertia hispidicalyx | 149,488 | 38.19 | 80,727 | 36.30 | 17,903 | 31.81 | 25,429 | 43.42 | NC_044474 | 133 | 37 | 8 | 88 |
| Swertia kouitchensis | 153,475 | 38.15 | 83,595 | 36.23 | 18,348 | 31.93 | 25,766 | 43.47 | MZ261902 | 133 | 37 | 8 | 88 |
| Swertia leducii | 153,015 | 38.17 | 83,048 | 36.35 | 18,395 | 31.90 | 25,785 | 43.44 | NC_045301 | 134 | 37 | 8 | 89 |
| Swertia macrosperma | 152,737 | 38.22 | 83,046 | 36.31 | 18,231 | 31.99 | 25,730 | 43.50 | MZ261903 | 133 | 37 | 8 | 88 |
| Swertia multicaulis | 152,190 | 38.10 | 82,893 | 36.25 | 18,343 | 31.82 | 25,477 | 43.35 | NC_050660 | 131 | 37 | 8 | 86 |
| Swertia mussotii | 153,499 | 38.16 | 83,591 | 36.23 | 18,336 | 31.95 | 25,761 | 43.50 | KU641021 | 134 | 37 | 8 | 89 |
| Swertia nervosa | 153,690 | 38.12 | 83,864 | 36.25 | 18,254 | 31.82 | 25,786 | 43.37 | NC_057596 | 131 | 37 | 8 | 86 |
| Swertia przewalskii | 151,079 | 38.1 | 81,780 | 33.22 | 18,193 | 33.66 | 25,553 | 42.16 | ON017794 | 133 | 37 | 8 | 88 |
| Swertia pubescens | 149,036 | 38.19 | 80,432 | 36.33 | 17,936 | 31.81 | 25,334 | 43.42 | MZ261905 | 133 | 37 | 8 | 88 |
| Swertia punicea | 153,448 | 38.15 | 83,535 | 36.25 | 18,345 | 31.88 | 25,784 | 43.47 | MZ261896 | 133 | 37 | 8 | 88 |
| Swertia souliei | 152,804 | 38.08 | 83,195 | 36.17 | 18,105 | 31.89 | 25,752 | 43.33 | NC_052874 | 134 | 37 | 8 | 89 |
| Swertia tetraptera | 152,787 | 38.1 | 83,177 | 32.18 | 18,305 | 32.18 | 25,679 | 44.38 | ON164641 | 134 | 37 | 8 | 89 |
| Swertia verticillifolia | 151,682 | 38.14 | 82,623 | 36.26 | 18,335 | 31.83 | 25,362 | 43.48 | MF795137 | 134 | 37 | 8 | 89 |
| Swertia wolfgangiana | 153,225 | 38.06 | 83,528 | 36.17 | 18,219 | 31.88 | 25,739 | 43.34 | MW344307 | 134 | 37 | 8 | 89 |

Materials And Methods

Sampling and DNA Extraction

We collected fresh young leaves of S. tetraptera, S. franchetiana, S. przewalskii and S. bifolia from Mengyuan County of Qinghai Province (101.32°E, 37.62°N, 3,208 m), Huangzhong County of Qinghai Province (101.63°E, 36.57°N, 2,510 m), Qilian County of Qinghai Province (99.61°E, 38.83°N, 3,234 m), and Qilian County of Qinghai Province (102.22°E, 37.45°N, 3,135 m), respectively. The leaves were dried rapidly and stored in silica gel. Voucher specimens of these four species were deposited in the Qinghai-Tibetan Plateau Museum of Biology (QTPMB) with voucher numbers QHGC-2011, QHGC20190821, QHGC-2013, and QHGC-2014, respectively.

DNA extraction, library preparation and genome sequencing

The total genomic DNA of the four Swertia L. species was extracted from dried leaves using a modified CTAB method [18], and its purity and concentration were assessed with a NanoDrop 2000 microspectrophotometer. Each genomic DNA sample was sheared into fragments of different lengths by sonication. The DNA fragments were then purified and end-repaired, an A-tail was added to the 3' ends, and sequencing adapters were ligated. After that, agarose gel electrophoresis was used to select suitably sized DNA fragments, and PCR amplification was performed to complete the preparation of the sequencing library. After the libraries passed quality inspection, 150 bp paired-end sequencing was performed on the Illumina HiSeq platform (Beijing Biomarker Technologies Co., Ltd.).

Chloroplast genome assembly and annotation

Raw image files were converted into sequence reads (raw data) by base calling. NGSQCToolkit v2.3.3 [19] was used to filter the raw reads, removing low-quality regions to obtain clean reads, which were stored in FASTQ format. We used the Iterative Organelle Genome Assembly (IOGA) pipeline to assemble the chloroplast genomes, with S. mussotii (NC_031155) serving as a reference [20]. SPAdes v3.6.1 was then employed for de novo assembly under default parameters to generate a series of contigs [21]. Contigs larger than 1,000 bp were used for chloroplast genome assembly. Complete chloroplast genome sequences were constructed by matching and linking contigs [22] and by filling the remaining gaps using second-generation sequencing data.

The chloroplast genomes of the four Swertia L. species were annotated using the online program GeSeq [23] and PGA software [24]. We compared the annotations from the two methods and made final adjustments manually in Geneious version 11.0.2 [22]. We then checked the initial annotation, putative start and stop codons, and intron positions by comparison with homologous genes in the congeneric species S. mussotii. OGDRAW [25] was used to draw circular plastid genome maps of the four Swertia L. species. Finally, the sequence data and gene annotation information of the four species were submitted to the NCBI database under accession numbers NC_056357 (S. franchetiana), ON164641 (S. tetraptera), ON017794 (S. przewalskii), and SUB11740174 (S. bifolia).

Simple Sequence Repeat (SSR) and Relative Synonymous Codon Usage (RSCU) Analysis

We used the online MISA program [26] to detect SSRs in the chloroplast genomes of 34 species in Subtribe Swertiinae using the following parameters: mononucleotide unit repetition number ≥ 10; dinucleotide unit repetition number ≥ 5; trinucleotide unit repetition number ≥ 4; and tetranucleotide, pentanucleotide, and hexanucleotide unit repetition number ≥ 3 (Beier et al. 2017). CodonW1.4.2 software was also employed to confirm the amino acid usage frequency and relative synonymous codon usage (RSCU) [27].
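The repeat thresholds listed above can be illustrated with a minimal Python sketch. It is not the MISA program itself: it only finds perfect repeats with a regular expression, ignores compound SSRs, and the names `MIN_REPEATS` and `find_ssrs` are hypothetical names introduced for illustration.

```python
import re

# Minimum repeat counts per motif length, taken from the thresholds stated above:
# mono >= 10, di >= 5, tri >= 4, tetra/penta/hexa >= 3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA"),
            # so that a poly-A run is reported once, as a mononucleotide SSR.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (not real data): a 12-bp poly-A run and an (AT)6 repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
print(find_ssrs(toy))
```

Start positions are 0-based. A real analysis would run MISA on each complete plastome; the sketch is only meant to make the threshold definitions concrete.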

Comparative Analysis of Complete Chloroplast Genomes

We used IRscope to visualize the boundaries among the four main chloroplast regions (LSC/IRb/SSC/IRa) of the 34 species in Subtribe Swertiinae [28]. Mauve was used to detect rearrangements among the 34 chloroplast genomes. The online tool mVISTA was used to compare the 34 chloroplast genomes in Shuffle-LAGAN mode [29], with Veratrilla baillonii as the reference genome. The method developed by Zhang et al. (2011) [30] was used to calculate the percentages of variable characters in the coding and noncoding regions of the chloroplast genomes.

Analysis of Synonymous (Ks) and Non-Synonymous (Ka) Substitution Rate

We estimated the selective pressure on protein-coding genes located in three regions of the chloroplast genomes (LSC, SSC and one IR copy). Protein-coding genes shared by all 34 species were extracted from the complete chloroplast genomes for analysis of synonymous (Ks) and nonsynonymous (Ka) substitution rates. The mode of selection on each gene was inferred from its Ka/Ks ratio: Ka/Ks < 1 indicates purifying selection, Ka/Ks = 1 neutral evolution, and Ka/Ks > 1 positive selection [31]. Ka and Ks were calculated using KaKs_Calculator 2.0 [32] with the following settings: genetic code table 11 (bacterial and plant plastid code); method of calculation: NG.
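The decision rule above can be written as a short Python sketch. It does not reimplement the NG method or KaKs_Calculator; it only classifies Ka and Ks values that have already been estimated. The function name `classify_selection`, the tolerance `tol`, and the example values are illustrative assumptions.

```python
def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks ratio, label) following the thresholds in the text.

    Genes with Ka = 0 or Ks = 0 are reported as not calculable, because the
    ratio is then zero or undefined and carries no usable signal.
    """
    if ka == 0 or ks == 0:
        return None, "not calculable"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return ratio, "neutral evolution"
    if ratio < 1.0:
        return ratio, "purifying selection"
    return ratio, "positive selection"


# Hypothetical Ka and Ks values, for illustration only.
for ka, ks in [(0.02, 0.10), (0.30, 0.10), (0.00, 0.10)]:
    print(ka, ks, classify_selection(ka, ks))
```

In practice an estimated ratio is almost never exactly 1, so a statistical test is usually needed before a gene is called neutral or positively selected.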

Phylogenetic Analysis

To examine the phylogenetic relationships among the 34 species of 9 genera within Subtribe Swertiinae, phylogenetic trees were constructed using Gentiana straminea (KJ657732), Gardneria ovata (NC_065470) and Amalocalyx microlobus (NC_067035) as outgroups. Trees were built from both the complete chloroplast genome sequences and the 80 protein-coding genes shared by the 34 chloroplast genomes. All sequences were aligned with MAFFT (version 7) [33], and phylogenetic analyses were performed by Bayesian inference (BI) in MrBayes v3.2.1 [35] under the best-fit substitution model GTR + I + G, selected by AIC in MrModeltest 2.3 [34]. The BI analysis was run independently with four Markov chain Monte Carlo (MCMC) chains (three heated chains and one cold chain), starting from a random tree; each chain was run for 2 × 10^7 generations and sampled every 2,000 generations, and the first 25% of sampled trees were discarded as burn-in. We assessed convergence using an average standard deviation of split frequencies (ASDSF) < 0.01 and used Tracer v1.7.1 [36] to check for an effective sample size (ESS) > 200. Nodes were considered well supported when their Bayesian posterior probability (PP) was ≥ 0.95.
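As a quick check on the sampling scheme described above, the number of trees retained per run follows directly from the chain length, the sampling frequency and the burn-in fraction. The Python sketch below is only arithmetic. The function name and the `include_start` flag are illustrative assumptions, and whether the starting tree is counted as a sample depends on the software.

```python
def retained_samples(generations, sample_freq, burnin_frac, include_start=True):
    """Return (total sampled trees, trees discarded as burn-in, trees kept)."""
    total = generations // sample_freq + (1 if include_start else 0)
    discarded = int(total * burnin_frac)  # round down to whole trees
    return total, discarded, total - discarded


# Settings from the text: 2 x 10^7 generations, sampled every 2,000, 25% burn-in.
print(retained_samples(20_000_000, 2_000, 0.25))
print(retained_samples(20_000_000, 2_000, 0.25, include_start=False))
```

Under these settings roughly 7,500 trees per run contribute to the consensus tree and to the posterior probabilities.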

Results

Structural features of Subtribe Swertiinae chloroplast genomes

In this study, we analyzed the chloroplast genome features and gene contents of 34 species in 9 genera from Subtribe Swertiinae (Table 1 and Table S1). All 34 chloroplast genomes demonstrated the typical quadripartite structure shared by the majority of angiosperm chloroplast genomes (Fig. 1). Chloroplast genome length varied between genera and species, ranging from 149,036 bp (S. pubescens) to 154,365 bp (Pterygocalyx volubilis), with an average length of 152,274 bp (Table 1). The longest chloroplast genome (154,365 bp) differed from the other chloroplast genomes in Subtribe Swertiinae by 0.614–5.329 kb. All complete chloroplast genomes were made up of four parts: an LSC region (80,432–84,153 bp), an SSC region (17,887–18,476 bp), and two IR regions (25,069–26,126 bp). The GC content of the 34 species was very similar in both the whole chloroplast genome (37.50–38.26%) and the corresponding regions (LSC, 32.18–36.36%; SSC, 30.39–33.66%; IR, 42.16–43.38%), with the IR regions having the highest GC contents (Table 1).
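Because the genome is quadripartite, each total length in Table 1 should equal LSC + SSC + 2 × IR, and the whole-genome GC content should be close to the length-weighted mean of the regional GC contents. The following Python sketch checks both relations for three rows copied from Table 1. The function and variable names are illustrative assumptions, not part of the original analysis.

```python
def total_length(lsc, ssc, ir):
    """Plastome length from its four parts: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def weighted_gc(lsc, ssc, ir, gc_lsc, gc_ssc, gc_ir):
    """Length-weighted mean GC content (%) over LSC, SSC and both IR copies."""
    return (lsc * gc_lsc + ssc * gc_ssc + 2 * ir * gc_ir) / total_length(lsc, ssc, ir)


# (total, LSC, SSC, IR) lengths in bp, copied from Table 1.
TABLE1_ROWS = {
    "Comastoma falcatum": (151_423, 81_721, 18_248, 25_727),
    "Pterygocalyx volubilis": (154_365, 84_033, 18_476, 25_928),
    "Swertia pubescens": (149_036, 80_432, 17_936, 25_334),
}

for species, (total, lsc, ssc, ir) in TABLE1_ROWS.items():
    print(species, total_length(lsc, ssc, ir) == total)

# Regional GC values for Comastoma falcatum from Table 1; the reported total is 38.26%.
print(round(weighted_gc(81_721, 18_248, 25_727, 36.34, 31.78, 43.59), 2))
```

Small deviations in the weighted GC value are expected because the regional percentages in Table 1 are rounded to two decimals.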

Gene content varied slightly among the chloroplast genomes of the 34 species. The total number of genes ranged from 129 (Gentianopsis paludosa) to 134 (G. grandis, H. coreana, S. bimaculata, S. diluta, S. leducii, S. mussotii, S. souliei, S. tetraptera, S. verticillifolia and S. wolfgangiana) (Table 1). Accordingly, the number of protein-coding genes also varied, ranging from 84 to 89. However, the numbers of tRNA genes (37) and rRNA genes (8) were conserved among species (Table 1 and Table S1). Among the protein-coding genes, four pseudogenes (rps16, infA, ycf1 and rps19) were found. Except for the lack of the rpl33 gene in the chloroplast genomes of S. dilatata, S. hispidicalyx, P. volubilis and C. pulmonarium, the rpl2 gene in the chloroplast genome of C. falcatum and the ycf15 gene in the chloroplast genome of G. paludosa, differences in gene content were caused by these four pseudogenes. For example, due to the lack of the rps16, ycf1 and rps19 pseudogenes, the chloroplast genome of Lomatogoniopsis alpina contained 131 genes (Table S1). Among all the genes, 18 genes (trnK-UUU, rps16, trnG-UCC, atpF, rpoC1, ycf3, trnL-UAA, trnV-UAC, rps12, clpP, petB, petD, rpl16, rpl2, ndhB, trnI-GAU, trnA-UGC, ndhA) in H. elliptica, Veratrilla baillonii and S. punicea contained only one intron, while in the remaining 31 species of Subtribe Swertiinae 17 genes contained one intron (the rps16 gene was absent or lacked an intron). Two protein-coding genes (ycf3 and clpP) contained two introns in all 34 chloroplast genomes (Table S2).

The functions of major genes in the chloroplast genome of Subtribe Swertiinae could be roughly divided into three categories (Table 2): photosynthesis-related genes, chloroplast self-replication-related genes and other genes. Genes associated with photosynthesis and self-replication made up the majority of the chloroplast genome.

Table 2

Gene composition of the chloroplast genomes of 34 species of 9 genera in Subtribe Swertiinae.

| Category | Group of genes | Name of genes |
| --- | --- | --- |
| Photosynthesis | Photosystem I | psaA, psaB, psaC, psaI, psaJ |
| | Photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| | NADH dehydrogenase | ndhA*, ndhB*, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| | Cytochrome b/f complex | petA, petB*, petD*, petG, petL, petN |
| | ATP synthase | atpA, atpB, atpE, atpF*, atpH, atpI |
| Self-replication | Ribosomal proteins (SSU) | rps2, rps3, rps4, rps7, rps8, rps11, rps12#, rps14, rps15, rps16*, rps18, rps19 |
| | Ribosomal proteins (LSU) | rpl2*, rpl14, rpl16*, rpl20, rpl22, rpl23, rpl32, rpl33, rpl36 |
| | Ribosomal RNAs | rrn4.5¹, rrn5¹, rrn16¹, rrn23¹ |
| | Transfer RNAs | tRNA-Lys*, tRNA-Gln, tRNA-Ser, tRNA-Gly*, tRNA-Arg, tRNA-Cys, tRNA-Asp, tRNA-Tyr, tRNA-Glu, tRNA-Thr, tRNA-Ser, tRNA-Gly, tRNA-Met, tRNA-Ser, tRNA-Thr, tRNA-Leu, tRNA-Phe, tRNA-Val, tRNA-Gly, tRNA-Met, tRNA-Trp, tRNA-Pro, tRNA-Ile, tRNA-Leu*, tRNA-Val*, tRNA-His, tRNA-Ile*¹, tRNA-Ala*¹, tRNA-Arg¹, tRNA-Asn¹, tRNA-Leu, tRNA-Asn, tRNA-Arg, tRNA-Ala, tRNA-Ile, tRNA-His |
| | DNA-dependent RNA polymerase | rpoA, rpoB, rpoC1*, rpoC2 |
| Other genes | Maturase | matK |
| | Protease | clpP** |
| | Envelope membrane protein | cemA |
| | Subunit acetyl-CoA-carboxylase | accD |
| | c-Type cytochrome synthesis gene | ccsA |
| Genes of unknown function | Conserved open reading frames | ycf1, ycf2ᵃ, ycf3**, ycf4, ycf15 |

Note: * represents a gene with one intron, ** represents a gene with two introns, and # represents a trans-spliced gene.

SSR and Codon usage analysis

The number of SSRs identified in the 34 Subtribe Swertiinae chloroplast genomes ranged from 36 (S. bifolia and S. erythrosticta) to 63 (S. cordata) (Fig. 2). Six types of repeat motif were found, and their numbers and types differed among the 34 chloroplast genomes. Among the mononucleotide repeats, A/T was dominant (50.00–82.22%), while C/G was rare (0–10.53%). Dinucleotides (1.89–11.63%), trinucleotides (4.35–19.44%) and pentanucleotides (3.92–20.00%) were found in all samples. Tetranucleotides and hexanucleotides were identified in eighteen and nine samples, respectively (Fig. 3 and Table S3).

Codon usage frequency in the 34 Subtribe Swertiinae chloroplast genomes was calculated from the sequences of protein-coding genes (CDS). The number of codons in the protein-coding genes of the 34 chloroplast genomes ranged from 20,531 (S. tetraptera) to 26,402 (H. elliptica). In all species, serine (Ser; 1,075–2,268 instances) was the most abundant amino acid encoded by four codons, followed by arginine (Arg; 1,137–2,244 instances), encoded by six codons (Table S4). In contrast, methionine and tryptophan were each encoded by only one codon, with 219–610 and 387–605 instances, respectively, and showed no codon usage bias (RSCU = 1). The AGA codon for arginine had the largest RSCU values (1.70–2.11), and the CUG codon for leucine had the smallest RSCU values (0.31–0.80) across the 34 chloroplast genomes. Of the 64 codons, 26 had RSCU values greater than one, and 23 of these 26 codons ended with A or U, which reflects the codon preference of the 34 chloroplast genomes (Fig. 3, Table S4).
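For reference, the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for the same amino acid were used equally. The Python sketch below computes it for a list of coding sequences. It is a minimal illustration and not the CodonW program: the names `CODON_TABLE` and `rscu` are assumptions, and the toy sequence is invented.

```python
from collections import Counter, defaultdict

# Codon-to-amino-acid assignments in the usual TCAG order. The sense codons
# of the standard code are the same in genetic code table 11.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def rscu(cds_list):
    """Return {codon: RSCU} for every sense codon of each amino acid observed."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            # Ignore stop codons and codons containing ambiguous bases.
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)
    values = {}
    for codons in families.values():
        family_total = sum(counts[c] for c in codons)
        if family_total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            # Observed count over the mean count of the synonymous family.
            values[c] = counts[c] * len(codons) / family_total
    return values


# Invented toy CDS: Met, three AGA (Arg), one CGT (Arg), Trp, stop.
toy = rscu(["ATG" + "AGA" * 3 + "CGT" + "TGG" + "TAA"])
print(toy["AGA"], toy["CGT"], toy["ATG"], toy["TGG"])
```

Single-codon amino acids such as methionine and tryptophan always give RSCU = 1, which matches the values reported above.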

Comparative genome analysis

We used the online tool mVISTA to identify potentially divergent sequences among the 34 Subtribe Swertiinae chloroplast genomes, with the chloroplast genome of V. baillonii as a reference. The structures and sequences of the Subtribe Swertiinae chloroplast genomes were conserved, especially in the IR regions (Fig. 4). We also used DnaSP to calculate the variation rates of coding and noncoding regions. The results demonstrated that the variation rates of noncoding regions were generally higher than those of coding regions (Fig. 5). Variation in noncoding regions ranged from 11.11% to 99.28%, with an average of 63.98%, whereas variation in coding regions ranged from 5.78% to 88.97%, with an average of 25.39%. The variation rates of both coding and noncoding regions were lower in the IR region than in the other regions. Additionally, the noncoding intergenic regions were highly divergent, especially trnC-GCA-petN, trnS-GCU-trnR-UCU, ndhC-trnV-UAC, trnC-GCA-petN, psbM-trnD-GUC, trnG-GCC-trnfM-CAU, trnS-GGA-rps4, ndhC-trnV-UAC, accD-psaI, psbH-petB, rpl36-infA and rps15-ycf1. Highly divergent regions were also found within protein-coding regions, such as ycf3, petD, ndhF, petL, rpl20, rpl15 and ycf1. In addition, no genomic rearrangements were detected in the alignment of the 34 Subtribe Swertiinae chloroplast genomes.
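To make the notion of a variation rate concrete, the Python sketch below computes the percentage of alignment columns that differ among sequences. This is a simplified stand-in and not the exact procedure of Zhang et al. (2011) or of DnaSP: it treats any column containing a gap or a substitution as one variable character, and the function name and toy alignment are assumptions.

```python
def percent_variable(alignment):
    """Percentage of alignment columns that are not identical across all rows."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if length == 0 or any(len(row) != length for row in alignment):
        raise ValueError("rows must be non-empty and of equal length")
    variable = sum(
        1 for column in zip(*alignment)
        if len({base.upper() for base in column}) > 1
    )
    return 100.0 * variable / length


# Invented toy alignment: one gap column and one substitution column out of ten.
toy_alignment = [
    "ACGTACGTAA",
    "ACGTACGTAC",
    "ACG-ACGTAA",
]
print(percent_variable(toy_alignment))
```

Applied region by region to an aligned set of plastomes, a measure of this kind ranks intergenic spacers and genes by divergence, which is how hypervariable regions are shortlisted.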

Gene Selective Pressure Analysis

We calculated the nonsynonymous (Ka) and synonymous (Ks) substitution ratios for 80 protein-coding genes to estimate the selection pressure on chloroplast genes by comparing L. alpina with 33 other species in Subtribe Swertiinae. Sixty-three protein coding genes could not be calculated because of Ka or Ks = 0, demonstrating that no synonymous or nonsynonymous changes occurred. For the remaining 17 protein-coding genes, the results indicated that the mean Ka/Ks ratio between L. alpina and 33 other Subtribe Swertiinae species ranged from 0.01 (rpl14) to 2.34 (psbB) (Fig. 6). However, the Ka/Ks ratio for most genes was less than one, showing that they underwent negative selection, except for ccsA and psbB, which experienced positive selection (Ka/Ks > 1).

Phylogenetic Analysis

We used the complete chloroplast genome sequences and the 80 shared protein-coding sequences of 34 species from Subtribe Swertiinae to construct phylogenetic trees, with G. straminea, G. ovata and A. microlobus as outgroups. The trees built from the whole chloroplast genomes and from the CDSs had the same topology (Figure S1). The Bayesian trees demonstrated that all species in Subtribe Swertiinae formed a monophyletic clade with high Bayesian posterior probability support (PP = 1; Fig. 7). This well-supported clade was divided into two major clades (A and B). Clade A was located at the base of the phylogenetic tree and was divided into two subclades (A1 and A2). The A1 subclade (P. volubilis) was sister to the A2 subclade, which consisted of three species of Gentianopsis and V. baillonii. Interestingly, G. paludosa did not cluster with the other two species of the same genus but clustered with V. baillonii, indicating that G. paludosa was closely related to V. baillonii. Clade B contained 29 species from the remaining 6 genera of Subtribe Swertiinae, which formed three main branches in the phylogenetic tree (B1, B2 and B4): the subgen. Swertia branch (B1), the Gen. Halenia-Swertia dichotoma-Gen. Sinoswertia-Swertia bimaculata branch (B2) and the subgen. Ophelia-Gen. Comastoma-L. alpina-L. perenne branch (B4).

IR Contraction and Expansion

We used the IRscope online website (https://irscope.shinyapps.io/irapp/) to visualize the differences in the four boundaries of the LSC, SSC, and IRs. Comparison of all Subtribe Swertiinae plastomes with three outgroups uncovered relatively stable IRs, with little expansion or contraction (Fig. 8). In these 37 plastomes, the LSC-IRa borders were located in the rps19 gene with the exception of the LSC-IRa border of L. perenne, Halenia elliptica and G. ovata. In the outgroup G. ovata, the LSC-IRa border was located within the ndhB gene, while in L. perenne, the LSC-IRa border was located in the rpl22 gene, and the LSC-IRa border had shifted 59 bp. In H. elliptica, the LSC-IRa border was located within the rpl22 gene, which had undergone contraction. The boundary of SSC-IRa was positioned in the ndhF gene, ycf1 pseudogene and the intergenic spacer region between the ycf1 pseudogene and ndhF. The exact position of the SSC-IRb border shifted 10 bp in C. falcatum, 8 bp in S. cincta, 4 bp in S. mussotii, 9 bp in S. dichotoma, 5 bp in S. przewalskii, 15 bp in S. erythrosticta, 10 bp in S. cordata and 3 bp in the outgroup A. microlobus. The SSC/IRa border in all Subtribe Swertiinae plastomes was located inside the ycf1 gene with a few exceptions, and their sequences demonstrated length variabilities among species. The IRa/LSC border in most species’ chloroplast genomes of Subtribe Swertiinae is located at the junction of the trnH gene and the rps19 pseudogene. In the L. perenne chloroplast genome, the trnH gene was included far inside the LSC region, and the rps19 pseudogene was positioned at the IRa/LSC border. In V. baillonii, L. alpina, G. paludosa, G. barbata, C. pulmonarium, S. przewalskii, S. nervosa, S. multicaulis and S. cordata chloroplast genomes, rps19 pseudogenes were lost, and the IRa/LSC border in these chloroplast genomes was positioned at the trnH gene.

Discussion

Organization and Features of cp Genomes

Our study compared the features, content, and organization of the chloroplast genomes of 34 species in Subtribe Swertiinae, demonstrating that all of them exhibited the typical quadripartite structure found in vascular plants [37–39]. The length of the chloroplast genomes varied from 149,036 bp (S. pubescens) to 154,365 bp (P. volubilis), implying that they are relatively conserved, with only minor differences affecting their sizes. Differences in chloroplast genome length have previously been reported within genera and families, such as Swertia (Gentianaceae) [40], Notopterygium (Apiaceae) [41] and Rhodiola (Crassulaceae) [42], and in the subfamily Coryloideae of Betulaceae [43]. In this study, the differences in chloroplast genome length among the 34 species in 9 genera of Subtribe Swertiinae were mainly caused by the expansion and contraction of the IR region [44].

In terms of GC content, the chloroplast genomes of the 34 species in Subtribe Swertiinae had similar values (37.50–38.26%), indicating high similarity among species. The GC content in the IR region (43.39%) was higher than that in the other two regions (LSC, 35.92%; SSC, 31.88%), which may be related to the presence of four rRNA genes in the IR region (rrn16, rrn23, rrn4.5 and rrn5), as previously reported in many complete chloroplast genomes of angiosperms [45].

Regarding gene content, we found some differences among the chloroplast genomes of the 34 species in Subtribe Swertiinae. Gene numbers ranged from 129 (G. paludosa) to 134 (G. grandis, H. coreana, S. bimaculata, S. diluta, S. leducii, S. mussotii, S. souliei, S. tetraptera, S. verticillifolia and S. wolfgangiana). G. paludosa had 129 genes due to the absence of the ycf15 gene and the pseudogenes rps16, rps19, infA and ycf1, while G. grandis, H. coreana, S. bimaculata, S. diluta, S. leducii, S. mussotii, S. souliei, S. tetraptera, S. verticillifolia and S. wolfgangiana contained 134 genes because of a duplication of rps19 and ycf1. Duplicated rps19 and ycf1 pseudogenes have also been reported in other Gentianaceae species [46]. Additionally, the absence of ndh genes, including ndhA, ndhC, ndhG, ndhH, ndhI, ndhJ, and ndhK, has been reported in other Gentianaceae species [46]. However, the lack of the ycf15 gene has not been reported previously. Thus, the small changes in gene content in the chloroplast genomes of Subtribe Swertiinae are caused by evolutionary events of gene deletion and insertion.

Simple sequence repeats (SSRs) and codon usage analysis

Chloroplast SSRs usually show a high level of variation and are widely used in studies of polymorphism, population genetics and phylogenetics [47–49]. Our study analyzed the number of different SSR motifs in the cp genomes of 34 Subtribe Swertiinae species. Compared with other angiosperms, the number of chloroplast genome SSRs (36–63) in these species was low to medium. Among the SSRs, mononucleotide repeats were the most numerous and consisted mainly of poly-A and poly-T runs, which was consistent with the results of previous studies [50–53]. These SSRs may be useful for subsequent studies of interspecific genomic polymorphism and population genetics based on repeat length polymorphism. The mechanism by which most chloroplast SSRs arise is still debated; slipped-strand mispairing and intramolecular recombination are currently considered the main mechanisms [54].

Previous studies have shown that analyzing codon bias in the chloroplast genome helps to clarify the origin and evolution of species [55]. Codon usage frequency is also related to gene expression, and nucleotide composition is one of the main factors that shape codon usage bias: AT and GC contents are closely related to synonymous codon usage. In this study, most amino acids in the 34 chloroplast genomes showed codon bias, with preferred codons having RSCU > 1; the exceptions were methionine and tryptophan, each encoded by a single codon (RSCU = 1). Codons ending in A or U had higher RSCU values than those ending in G or C, indicating a preference for A or U at the third codon position in all 34 chloroplast genomes. Similar conclusions have been reached for Cinnamomum camphora [56], Notopterygium [41], Phyllanthaceae [57] and other taxa. These findings may therefore aid further understanding of the evolutionary history of Subtribe Swertiinae, especially the roles of natural selection and mutation pressure [39].
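The RSCU statistic referred to above is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is a minimal Python illustration under that standard definition, not the software used in this study; the function name `rscu` and the toy sequence are hypothetical. The codon assignments of the standard genetic code are used, which are the same as those of the plastid code.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """Return {codon: RSCU} for amino acids that occur in the sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group the sense codons into synonymous families.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                          # amino acid absent: RSCU undefined
        expected = total / len(family)        # equal-usage expectation
        for c in family:
            values[c] = counts[c] / expected
    return values

# Toy example: alanine is encoded three times (GCT, GCT, GCA).
toy = rscu("GCTGCTGCAATGTGG")
print(round(toy["GCT"], 3), round(toy["GCA"], 3), toy["ATG"], toy["TGG"])
```

In the toy example the A/U-ending alanine codons have RSCU > 1 and the unused G/C-ending ones have RSCU = 0, while methionine (ATG) and tryptophan (TGG) are fixed at 1 because each has a single codon.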

Comparative Genomes and Characterization of Substitution Rates

Although the chloroplast genome is considered fairly conserved in angiosperms, mutational hotspots are often found when the sequences of closely related species are compared. Such hotspots are widely used in plant phylogenetics, population genetics and DNA barcoding. In this study, DnaSP analysis identified nineteen highly variable regions, including twelve intergenic regions (trnC-GCA-petN, trnS-GCU-trnR-UCU, ndhC-trnV-UAC, trnC-GCA-petN, psbM-trnD-GUC, trnG-GCC-trnfM-CAU, trnS-GGA-rps4, ndhC-trnV-UAC, accD-psaI, psbH-petB, rpl36-infA and rps15-ycf1) and seven genes (ycf3, petD, ndhF, petL, rpl20, rpl15 and ycf1). Both large-scale studies [58] and specific case studies [59, 60] have identified mutational hotspots in noncoding and coding regions that can serve as high-resolution phylogenetic markers. For example, rps16-trnQ has been employed for DNA barcoding in phylogenetic studies of 12 different angiosperm genera because it is highly variable in most plants. In addition, the ycf1 gene is more suitable than existing candidate genes as a barcode for land plants because it contains more variable loci. The highly variable regions identified in the chloroplast genomes of Subtribe Swertiinae are therefore expected to provide adequate genetic information for studies of species delimitation and phylogenetic evolution in Gentianaceae.
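Hypervariable regions of this kind are usually found by computing nucleotide diversity (Pi, the mean number of pairwise differences per site) in sliding windows along an alignment. The following Python sketch illustrates that calculation only; it is not the DnaSP implementation. The function names, the choice to skip columns containing gaps or ambiguous bases, and the default window and step sizes are all assumptions made for the example.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for equal-length aligned sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption
    made for this sketch).
    """
    seqs = [s.upper() for s in seqs]
    columns = [col for col in zip(*seqs) if all(b in "ACGT" for b in col)]
    if len(seqs) < 2 or not columns:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))

def sliding_pi(seqs, window=600, step=200):
    """Return (window_start, Pi) pairs; window and step are assumed values."""
    length = len(seqs[0])
    last_start = max(length - window, 0)
    return [(start, nucleotide_diversity([s[start:start + window] for s in seqs]))
            for start in range(0, last_start + 1, step)]

# Toy alignment of three sequences, scanned in 2-bp windows.
toy = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(toy))
print(sliding_pi(toy, window=2, step=2))
```

Windows whose Pi clearly exceeds the genome-wide background are the candidates for hypervariable regions; the cut-off used to call a region highly variable is a separate choice that this sketch does not make.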

Phylogenetic relationships

The ML and BI trees constructed from complete chloroplast genome sequences and from shared protein-coding gene sequences had consistent topologies. All 34 Subtribe Swertiinae species formed a monophyletic clade that was sister to Subtribe Gentianinae. The chloroplast genome data therefore support the monophyly of Subtribe Swertiinae, a finding consistent with previous studies [4, 5, 61]. P. volubilis, Gentianopsis and V. baillonii were closely related and located at the base of Subtribe Swertiinae. In other studies, the basal groups of Subtribe Swertiinae also included Obolaria, Latouchea, Bartonia and Megacodon. Geographically, the basal groups are mostly isolated monotypic genera or small genera with only a few species: Obolaria (1 species) and Bartonia (4 species) are distributed in North America; Latouchea (1 species) and Megacodon (2 species) in Southwest China and the Himalayan region; Pterygocalyx (1 species) in Asia; and Veratrilla (2 species) in southwest China, northeast India, Sikkim and Bhutan. Morphologically, except for Bartonia (in which no floral nectary has been observed), Obolaria, Megacodon, Latouchea, Gentianopsis and Pterygocalyx all have floral nectaries at the base of the ovary, as does the outgroup Gentiana, whereas most species in the other genera of Subtribe Swertiinae have floral nectaries on the corolla lobes. Nectaries at the base of the ovary may thus be an ancestral character of Subtribe Swertiinae.

For the basal branches, the phylogeny based on chloroplast genomes did not fully agree with those of Cao et al. (2021) [5] and Xi et al. (2014) [4]. In our study, P. volubilis was located at the base of Subtribe Swertiinae and formed a clade of its own, while V. baillonii clustered with Gp. paludosa; Xi et al. (2014) [4], however, concluded that P. volubilis clustered with Gentianopsis ciliata and that V. baillonii formed a clade of its own. Morphological data show that although Gentianopsis and Pterygocalyx share the same floral characters, Gentianopsis plants are erect herbs with wingless seeds, whereas Pterygocalyx plants are twining herbs with winged seeds. The two genera are therefore morphologically distinct, and our results are consistent with the morphological evidence.

Apart from the basal groups, the chloroplast genome data support three main branches of Subtribe Swertiinae in the phylogenetic tree: the subgen. Swertia branch (B1), the Gen. Halenia-S. dichotoma-Gen. Sinoswertia-S. bimaculata branch (B2) and the subgen. Ophelia-Gen. Comastoma-L. alpina-L. perenne branch (B4). Our results and those of previous studies show that Swertia is paraphyletic with respect to other related genera, which are distributed in different clades. Swertia is therefore presumed to be the main group of Subtribe Swertiinae, from which the other related genera, themselves either monophyletic or paraphyletic, are derived. Although this study provides a new perspective on the intergeneric and interspecific relationships of Subtribe Swertiinae, only 34 species were included, and broader sampling is needed to infer the phylogenetic relationships within the subtribe more reliably.

Adaptive evolution of Subtribe Swertiinae

Synonymous and nonsynonymous nucleotide substitution patterns play a major role in adaptive evolution. In Subtribe Swertiinae, we did not detect significant positive selection for most genes; only two genes (ccsA and psbB) showed signs of possible positive selection, and these may have played a vital role in the adaptive evolution of the subtribe. Our results agree with a previous study showing that ccsA was under positive selection in the chloroplast genomes of 15 selected angiosperms [62]. psbB, which encodes a photosystem subunit (Table 2), plays a vital role in the life history of plants. The ccsA gene is a c-type cytochrome synthesis gene (Table 2). It encodes the cytochrome c synthesis protein, a membrane-bound protein of approximately 250–350 amino acids. The ccsA product can form a complex with the protein encoded by another gene, ccsB [63]. Xie and Merchant (1996) [64] showed that ccsA is required for heme attachment to chloroplast c-type cytochromes. These findings have implications for understanding the adaptive evolution of ccsA in angiosperms. Both genes are closely linked to physiological processes such as photosynthesis; their positive selection may therefore have helped Subtribe Swertiinae species adapt to diverse environments and may have contributed to their wide global distribution.

Conclusion

We present a comparative analysis of the plastomes of 34 Subtribe Swertiinae species and a comprehensive study of their phylogenetic relationships, divergence time estimation and adaptive evolution. The phylogenetic analysis supported the monophyly of Subtribe Swertiinae and the paraphyly of Swertia with respect to other related genera. Considerable inconsistency was observed between the molecular phylogeny and the traditional classification of Halenia, Sinoswertia, Comastoma, Lomatogoniopsis and Lomatogonium. Positive selection analyses showed that two genes (ccsA and psbB) had high Ka/Ks ratios, indicating that some chloroplast genes may have undergone positive selection during their evolutionary history. These results provide valuable information for elucidating the phylogeny, divergence times and evolutionary processes of Subtribe Swertiinae.

Declarations

Ethics and consent to participate

The authors declare that the experimental research on the plants described in this paper complies with institutional, national and international guidelines. Field studies were conducted in accordance with local legislation and with permission from the provincial forestry and grassland department of Qinghai Province. Voucher specimens of all plants were deposited at the herbarium of the Qinghai-Tibetan Plateau Museum of Biology (QTPMB), Xining, Qinghai Province, China.

Consent for publication

Not applicable. 

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files. The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests

The authors declare no competing interests.

Funding

This work was supported by funds from the Qinghai Province Key Laboratory construction project [2022-ZJ-Y18]. The funders were not involved in the study design, data collection and analysis, decision to publish, or manuscript preparation.

Authors' contributions

YL conceived the study, performed data analysis and drafted the manuscript; DS collected samples; ZY and DQ extracted DNA for next-generation sequencing; DS, ZY and DQ reviewed the manuscript critically. All authors have read and agreed with the contents of the manuscript.

Acknowledgements

We would like to thank Ms. Jingjing Li and Mr. Hongcai Yue for their help with sample collection.

References

  1. Struwe L, Albert VA. Gentianaceae: systematics and natural history. New York: Cambridge University Press. 2002; 242.
  2. von Hagen KB, Kadereit JW. Phylogeny and flower evolution of the Swertiinae (Gentianaceae-Gentianeae): Homoplasy and the principle of variable proportions. Syst Bot. 2002; 27: 548-572.
  3. Kadereit JW, von Hagen KB. The evolution of flower morphology in Gentianaceae-Swertiinae and the roles of key innovations and niche width for the diversification of Gentianella and Halenia in South America. Int J Plant Sci. 2003; 164 (5): 441-452. https://doi.org/10.1086/376880
  4. Xi HC, Sun Y, Xue CY. Molecular Phylogeny of Swertiinae (Gentianaceae-Gentianeae) Based on Sequence Data of ITS and matK. Plant Divers Resour. 2014; 36(2):145-156.
  5. Cao Q, Xu LH, Wang JL, Zhang FQ, Chen SL. Molecular phylogeny of subtribe Swertiinae. Bull Bot Res. 2021; 41 (3): 408-418. https://doi.org/10.7525/j.issn.1673-5102.2021.03.011
  6. Favre A, Matuszak S, Sun H, Liu ED, Yuan YM, Muellner-Riehl AN. Two new genera of Gentianinae (Gentianaceae): Sinogentiana and Kuepferia supported by molecular phylogenetic evidence. Taxon. 2014; 63(2): 342-354. https://doi.org/10.12705/632.5
  7. Ho TN, Liu SW. A worldwide monograph of Swertia and its allies. Beijing: Science Press. 2015.
  8. Sun SS, Fu PC. Study on Taxonomy and Evolution of Gentianeae (Gentianaceae). Acta Bot Boreal-Occident Sin. 2019; 39(2): 363-370.
  9. Refulio-Rodriguez NF, Olmstead RG. Phylogeny of Lamiidae. Am J Bot. 2014; 101(2): 287-299. https://doi.org/10.3732/ajb.1300394
  10. Redwan RM, Saidin A, Kumar SV. Complete chloroplast genome sequence of MD-2 pineapple and its comparative analysis among nine other plants from the subclass Commelinidae. BMC Plant Biol. 2015; 15: 196. https://doi.org/10.1186/s12870-015-0587-1
  11. Fonseca LHM and Lohmann LG. Plastome Rearrangements in the "Adenocalymma-Neojobertia" Clade (Bignonieae, Bignoniaceae) and Its Phylogenetic Implications. Front Plant Sci. 2017; 8: 1875. https://doi.org/10.3389/fpls.2017.01875
  12. Fonseca LHM and Lohmann LG. Combining high-throughput sequencing and targeted loci data to infer the phylogeny of the "Adenocalymma-Neojobertia" clade (Bignonieae, Bignoniaceae). Mol Phylogenet Evol. 2018; 123: 1-15. https://doi.org/10.1016/j.ympev.2018.01.023
  13. Guo LL, Guo S, Xu J, He LX, Carlson JE, Hou XG. Phylogenetic analysis based on chloroplast genome uncover evolutionary relationship of all the nine species and six cultivars of tree peony. Ind Crops Prod. 2020; 153: 112567. https://doi.org/10.1016/j.indcrop.2020.112567
  14. Jiang Y, Miao YJ, Qian J, Zheng Y, Xia CL, Yang QS, Liu C, Huang LF, Duan BZ. Comparative analysis of complete chloroplast genome sequences of five endangered species and new insights into phylogenetic relationships of Paris. Gene. 2022; 833: 146572. https://doi.org/10.1016/j.gene.2022.146572
  15. Zhang W, Wang HY, Dong JH, Zhang TJ, and Xiao HX. Comparative chloroplast genomes and phylogenetic analysis of Aquilegia. Appl Plant Sci. 2021; 9(3): e11412. https://doi.org/10.1002/aps3.11412
  16. Tang CQ, Chen X, Deng YF, Geng LY, Ma JH and Wei XY. Complete chloroplast genomes of Sorbus sensu stricto (Rosaceae): comparative analyses and phylogenetic relationships. BMC Plant Biol. 2022; 22(1): 495. https://doi.org/10.1186/s12870-022-03858-5.
  17. Cui N, Chen WX, Li XW, Wang P. Comparative chloroplast genomes and phylogenetic analyses of Pinellia. Mol Biol Rep. 2022; 49:7873-7885. https://doi.org/10.1007/s11033-022-07617-5
  18. Doyle J. DNA protocols for plants-CTAB total DNA isolation. In: Hewitt GM, Johnston A, editors. Molecular techniques in taxonomy. Berlin: Springer. 1991.
  19. Patel RK, Jain M. NGS QC Toolkit: A toolkit for quality control of next generation sequencing data. PLoS One. 2012; 7: e30619. https://doi.org/10.1371/journal.pone.0030619
  20. Bakker FT, Lei D, Yu JY, Mohammadin S, Wei Z, van de Kerke S, Gravendeel B, Nieuwenhuis M, Staats M and Alquezar-Planas DE. Herbarium genomics: Plastome sequence assembly from a range of herbarium specimens using an iterative organelle genome assembly pipeline. Biol J Linn Soc. 2016; 117: 33-43. https://doi.org/10.1111/bij.12642
  21. Prjibelski A, Antipov D, Meleshko D, Lapidus A and Korobeynikov A. Using SPAdes de novo assembler. Curr Protoc Bioinforma. 2020; 70 (1): e102.
  22. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012; 28 (12): 1647-1649. https://doi.org/10.1093/bioinformatics/bts199
  23. Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, Greiner S. GeSeq—Versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017; 45: W6-W11. https://doi.org/10.1093/nar/gkx391.
  24. Qu XJ, Moore MJ, Li DZ and Yi TS. PGA: A software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods. 2019; 15: 50. https://doi.org/10.1186/s13007-019-0435-7
  25. Lohse M, Drechsel O, Kahlau S, Bock R. OrganellarGenomeDRAW—a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucleic Acids Res. 2013; 41: W575-W581. https://doi.org/10.1093/nar/gkt289
  26. Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: A web server for microsatellite prediction. Bioinformatics. 2017; 33: 2583-2585. https://doi.org/10.1093/bioinformatics/btx198
  27. Sun X, Yang Q, Xia X. An improved implementation of effective number of codons (Nc). Mol Biol Evol. 2013; 30: 191-196. https://doi.org/10.1093/molbev/mss201
  28. Amiryousefi A, Hyvönen J and Poczai P. IRscope: An online program to visualize the junction sites of chloroplast genomes. Bioinformatics. 2018; 34 (17): 3030-3031. https://doi.org/10.1093/bioinformatics/bty220
  29. Frazer KA, Pachter L, Poliakov A, Rubin EM and Dubchak I. Vista: Computational tools for comparative genomics. Nucleic Acids Res. 2004; 32 (Suppl.2): W273-W279. https://doi.org/10.1093/nar/gkh458
  30. Zhang YJ, Ma PF, Li DZ. High-throughput sequencing of six bamboo chloroplast genomes: Phylogenetic implications for temperate woody bamboos (Poaceae: Bambusoideae). PLoS One. 2011; 6: e20596. https://doi.org/10.1371/journal.pone.0020596
  31. Lawrie DS, Messer PW, Hershberg R, Petrov DA. Strong purifying selection at synonymous sites in D. melanogaster. PLoS Genet. 2013; 9: e1003527. https://doi.org/10.1371/journal.pgen.1003527
  32. Wang DP, Zhang YB, Zhang Z, Zhu J and Yu J. KaKs_Calculator 2.0: A Toolkit Incorporating Gamma-Series Methods and Sliding Window Strategies. Genom Proteom Bioinf. 2010; 8(1): 77-80. http://doi.org/10.1016/S1672-0229(10)60008-3
  33. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol. 2013; 30(4): 772-780. https://doi.org/10.1093/molbev/mst010
  34. Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: More models, new heuristics and parallel computing. Nat Methods. 2012; 9: 772. https://doi.org/10.1038/nmeth.2109
  35. Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space. Syst Biol. 2012; 61(3): 539-542. https://doi.org/10.1093/sysbio/sys029
  36. Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7. Syst Biol. 2018; 67(5): 901-904. https://doi.org/10.1093/sysbio/syy032
  37. Wicke S, Schneeweiss GM, Depamphilis CW, Müller KF, and Quandt D. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 2011; 76: 273-297. http://doi.org/10.1007/s11103-011-9762-4
  38. Tonti-Filippini J, Nevill PG, Dixon K, Small I. What can we do with 1000 plastid genomes? Plant J. 2017; 90:808–818. https://doi.org/10.1111/tpj.13491
  39. Zhang L, Wang S, Su C, Harris AJ, Zhao L, Su N, Wang JR, Duan L, Chang ZY. Comparative chloroplast genomics and phylogenetic analysis of Zygophyllum (Zygophyllaceae) of China. Front Plant Sci. 2021; 12:723622. https://doi.org/10.3389/fpls.2021.723622
  40. Yang LC, Li JJ and Zhou GY. Comparative chloroplast genome analyses of 23 species in Swertia L. (Gentianaceae) with implications for its phylogeny. Front Genet. 2022; 13:895146. https://doi.org/10.3389/fgene.2022.895146
  41. Yang J, Yue M, Niu C, Ma XF and Li ZH. Comparative Analysis of the Complete Chloroplast Genome of Four Endangered Herbals of Notopterygium. Genes. 2017; 8: 124. https://doi.org/10.3390/genes8040124
  42. Zhao DN, Ren Y, Zhang JQ. Conservation and innovation: Plastome evolution during rapid radiation of Rhodiola on the Qinghai-Tibetan Plateau. Mol Phylogenet Evol. 2020; 144: 106713. https://doi.org/10.1016/j.ympev.2019.106713
  43. Hu GL, Cheng LL, Huang WG, Cao QC, Zhou L, Jia WS, Lan YP. Chloroplast genomes of seven species of Coryloideae (Betulaceae): Structures and comparative analysis. Genome. 2020; 63: 337-348. https://doi.org/10.1139/gen-2019-0153
  44. Huang R, Xie X, Li F, Tian EW, Chao Z. Chloroplast genomes of two Mediterranean Bupleurum species and the phylogenetic relationship inferred from combined analysis with East Asian species. Planta. 2021; 253: 81. https://doi.org/10.1007/s00425-021-03602-7
  45. Chen XC, Li QS, Li Y, Qian J, Han JP. Chloroplast genome of Aconitum barbatum var. puberulum (Ranunculaceae) derived from CCS reads using the PacBio RS platform. Front Plant Sci. 2015; 6:42. https://doi.org/10.3389/fpls.2015.00042
  46. Dong BR, Zhao ZL, Ni LH, Wu JR and Danzhen ZG. Comparative analysis of complete chloroplast genome sequences within Gentianaceae and significance of identifying species. Chin Tradit Herb Drugs. 2020; 51 (6):1641-1649. http://doi.org/10.7501/j.issn.0253-2670.2020.06.033
  47. Ebert D, Peakall R. Chloroplast simple sequence repeats (cpSSRs): Technical resources and recommendations for expanding cpSSR discovery and applications to a wide array of plant. Mol Ecol Resour. 2009; 9: 673-690. https://doi.org/10.1111/j.1755-0998.2008.02319.x
  48. George BJ, Bhatt BS, Awasthi M, George B, Singh AK. Comparative analysis of microsatellites in chloroplast genomes of lower and higher plants. Curr Genet. 2015; 61:665-677. https://doi.org/10.1007/s00294-015-0495-9
  49. Khan G, Zhang FQ, Gao QB, Fu PC, Zhang Y, Chen SL. Spiroides shrubs on Qinghai-Tibetan Plateau: multilocus phylogeography and palaeodistributional reconstruction of Spiraea alpina and S. mongolica (Rosaceae). Mol Phylogenet Evol. 2018; 123: 137-148. https://doi.org/10.1016/j.ympev.2018.02.009
  50. Hu Y, Woeste KE, Zhao P. Completion of the Chloroplast Genomes of Five Chinese Juglans and Their Contribution to Chloroplast Phylogeny. Front Plant Sci. 2017; 7: 1955. https://doi.org/10.3389/fpls.2016.01955
  51. Lin M, Qi X, Chen J, Sun L, Zhong Y, Fang J, Hu C. The complete chloroplast genome sequence of Actinidia arguta using the PacBio RS II platform. PLoS ONE. 2018; 13: e0197393. https://doi.org/10.1371/journal.pone.0197393
  52. Mehmood F, Abdullah, Ubaid Z, Bao Y, Poczai P. Comparative Plastomics of Ashwagandha (Withania, Solanaceae) and Identification of Mutational Hotspots for Barcoding Medicinal Plants. Plants. 2020; 9: 752. https://doi.org/10.3390/plants9060752
  53. Kim SC, Lee JW and Choi BK. Seven Complete Chloroplast Genomes from Symplocos: Genome Organization and Comparative Analysis. Forests. 2021; 12: 608. https://doi.org/10.3390/f12050608
  54. Ochoterena H. Homology in coding and noncoding DNA sequences: a parsimony perspective. Plant Syst Evol. 2009; 282: 151-168.
  55. Lopez JL, Lozano MJ, Lagares A, Fabre ML, Draghi WO, Del Papa MF, Pistorio M, Becker A, Wibberg D, Schluter A, Puhler A, Blom J, Goesmann A, Lagares A. Codon Usage Heterogeneity in the Multipartite Prokaryote Genome: Selection-Based Coding Bias Associated with Gene Location, Expression Level, and Ancestry. mBio. 2019; 10(3): e00505-19. https://doi.org/10.1128/mBio.00505-19
  56. Qin Z, Zheng YJ, Gui LJ, Xie GA, Wu YF. Codon usage bias analysis of chloroplast genome of camphora tree (Cinnamomum camphora). Guihaia. 2018; 38(10): 1346-1355.
  57. Rehman U, Sultana N, Abdullah, Jamal A, Muzaffar M, Poczai P. Comparative Chloroplast Genomics in Phyllanthaceae Species. Diversity. 2021; 13(9): 403. https://doi.org/10.3390/d13090403
  58. Shaw J, Shafer HL, Leonard OR, Kovach MJ, Schorr M, Morris AB. Chloroplast DNA sequence utility for the lowest phylogenetic and phylogeographic inferences in angiosperms: the tortoise and the hare IV. Am J Bot. 2014; 101:1987-2004. https://doi.org/10.3732/ajb.1400398
  59. Alwadani KG, Janes JK, Andrew RL. Chloroplast genome analysis of box-ironbark Eucalyptus. Mol Phylogenet Evol. 2019; 136: 76-86. https://doi.org/10.1016/j.ympev.2019.04.001
  60. Ye WQ, Yap ZY, Li P, Comes HP, Qiu YX. Plastome organization, genome-based phylogeny and evolution of plastid genes in Podophylloideae (Berberidaceae). Mol Phylogenet Evol. 2018; 127: 978-987. https://doi.org/10.1016/j.ympev.2018.07.001
  61. Chassot P, Nemomissa S, Yuan YM, Küpfer P. High paraphyly of Swertia L. (Gentianaceae) in the Gentianella-lineage as revealed by nuclear and chloroplast DNA sequence variation. Plant Syst Evol. 2001; 229 (1-2): 1-21. https://doi.org/10.1007/s006060170015
  62. Wang B, Gao L, Su YJ and Wang T. Adaptive Evolutionary Analysis of Chloroplast Genes in Euphyllophytes Based on Complete Chloroplast Genome Sequences. Acta Sci Nat Univ Sunyatseni. 2012; 51(3): 108-114.
  63. Hartshorne RS, Kern M, Meyer B, Clarke TA, Karas M, Richardson DJ, Simon J. A dedicated haem lyase is required for the maturation of a novel bacterial cytochrome c with unconventional covalent haem binding. Mol Microbiol. 2007; 64: 1049-1060. https://doi.org/10.1111/j.1365-2958.2007.05712.x
  64. Xie Z, Merchant S. The plastid-encoded ccsA gene is required for heme attachment to chloroplast c-type cytochromes. J Biol Chem. 1996; 271 (9): 4632-4639. https://doi.org/10.1074/jbc.271.9.4632