DOI: https://doi.org/10.21203/rs.3.rs-87181/v1
Background: Broussonetia kazinoki × Broussonetia papyifera (ZJGS) is a hybrid species, which has a very complicated hybrid origin. Its excellent characteristics make it both ecological benefits and economically valuable.
Results: This study aimes to further understand ZJGS and Moraceae taxa through the ZJGS chloroplast (cp) genome structure and the comparative with 12 closely related Moraceae species, especially the cross parent B. kazinoki and B. papyrifera. The analyses show that ZJGS cp genome is 160,903 bp in length. Among the 13 Moraceae species, the cp genome length of seven Broussonetia species (ranges from 160,239 bp to 162,594 bp) is larger than that of six Morus species (ranges from 158,459 bp to 159,265 bp). However, the average GC content of Broussonetia species is lower than that of Morus species, which is 35.72% and 36.26%, respectively. Compared with Moraceae species, the ZJGS cp genome has a high degree of sequence similarity with B. kazinoki and B. monoica. In the comparison of repeated sequences, ZJGS and its maternal species B. kazinoki shows similar simple sequence repeats (SSRs) frequency. Among the 77 shared protein-coding genes (PCGs) in Moraceae species, the obvious positive selection of Ka/Ks ratios acted on petD and rpl16 genes of B. kazinoki and B. papyrifera, respectively, and the Ka/Ks ratio > 3. Phylogenetic analysis based on shared PCGs from 28 species shows that ZJGS is closely related to maternal B. kazinoki.
Conclusions: These findings provide data support for the origin of ZJGS hybridization, and provide genomic resources for future ZJGS resource development and molecular breeding.
There are many hybridization phenomena in nature, which are regarded as effective methods to produce "positive species" [1–3]. Hybrid breeding is widely used in forestry production, specially in horticulture. At present, there are already many hybrid breeds used in agriculture and forestry production based on the heterosis. The hybrid aspen (Populus tremula × Populus tremuloides) has a higher yield than the parent species, and can be quickly regenerated from the root sucker [4–6]. Hybrids of Abies genus have the advantage of strong anti-pollution ability and resistance to pests and diseases [7, 8]. The heterosis is predicted by both positive gene contribution and environment. Hybridization is also an important way to produce gene diversity. Biologists have always focused on the role of hybridization in evolution. In-depth analyses of heterozygous genotypes showed that heterosis mainly came from the accumulation of a large number of rare superior alleles with positive dominance [9]. Lin et al. identified five candidate genes involved in high yield in super hybrid rice LYP9, among which heterozygous segments containing qSS7 and qHD8 showed superiority and contributed to heterosis [10]. In maize, the silent expression of Cell Number Regulator 1 gene (CNR1) can increase the size of plant and organ, and become a direct contributor to heterosis [11]. The nucleotide sequence of the plastid gene rbcL and the nuclear gene PgiC were used to study the reticulated evolution of the Dryopteris varia complex in Japan and a haplotype not belong to any existing species was found. It was speculated that it came from an extinct species [12]. This indicates that hybrids may have higher fitness due to heterosis, and become an effective way to preserve their genetic diversity. However, the above genetic information related to hybridization are mostly concentrated in nuclear genome. The chloroplast (cp) genome is the second largest genome after the nuclear genome, and has significant advantages in the study of phylogenetic evolution of species.
Chloroplast is a uniparentally inherited and stable organelle in green plants, who is often considered to be an informative resource for genetic diversity of plant [13–15]. The cp genome always consists of four conserved regions: a pair of inverted repeats (IRs), a small single-copy region (SSC) and a large single-copy region (LSC) [16–18]. Among these regions, the length of IRs regions is always diverse and determines the total length of cp genome [19, 20]. The cp genome contains a large number of functional genes, which can be divided into three categories: genes related to photosynthesis (Photosystem I: psa; Photosystem II: psb), genes related to gene expression (ribosomal RNA genes and transfer RNA genes, etc.), and other genes related to biosynthesis (ATP synthase gene, NADH dehydrogenase gene, etc.). It has been reported that matK and ccsA genes evolve faster than other gene groups [21, 22]. As the two largest genes, ycf1 and ycf2 located at the IR/SC junction and the IRs region, respectively, which proved to be useful for analyzing cp genome variation in higher plants [23]. As the development of sequencing technology, cp genome sequences are more and more available. The phylogenetic analyses based on the cp genome are achievable and revealed qualitative and valuable information to confirm the genetic relationship of different species [24, 25]. For instance, rice is an important cereal crop, so cultivating abundant varieties sounds vital for world’s population. The new cultivars are always evaluated by comparing to other ones on the affinity that according to the cp genome, and more than 20 species were categorized into 10 genome types [26]. To elucidate the evolution within neotropical-paleotropical bamboos, the cp genome of woody bamboo Guadua angustifolia was characterized and compared with other species, the results showed that the speciation events of extant species occurred during or after the Pliocene [27]. The complete cp genome sequence of Morus cathayana and Morus multicaulis were obtained and compared with other genus Morus, the results indicated that both natural selection and mutational bias have contributed to the codon bias [28].
Paper mulberry is a member of genus Broussonetia in Moraceae family. The tree displayed the features on fast-growing, easy breeding, resistant to pruning and widely distributing [29–31]. Vegetation survey in tailing’s area showes that the tree is an excellent native plant having a comprehensive absorption effect for various heavy metals [32]. Meanwhile, the leaves are an important source of pig feed and some chemical substances are also important resources for medicine [33]. Most importantly, as the name implies, because of its long fiber and ease of preparation, paper mulberry contributed to the invention of papermaking and then was also used in barkcloth [34]. Based on the multiple uses of paper mulberry, paper mulberry has historically been an important economic tree species and even accompanied by human expansion [35]. At the same time, humans continue to select varieties of paper mulberry. Now, there is some confusion in the name of paper mulberry. Paper mulberry, Broussonetia papyifera, B. kazinoki × B. papyifera are all used to called the tree, while it may not the same tree [36]. In recent year, a hybrid variety for paper mulberry was bred [15, 37]. The breeding process is showed in Fig. 1 and in order to standardize the name, we called it as the hybrid paper mulberry (ZJGS) in the study. ZJGS is a hybrid between B. kazinoki and B. papyrifera, whose maternal lineage is B. kazinoki and paternal is B. papyrifera, respectively. With a long breeding process of more than ten years, the researchers first used modern breeding technology and then loaded the hybrid materials into the spacecraft and bred by space mutation. The mutant materials were optimized and cultivated after the spacecraft returned to the ground, and finally hybrid plants with excellent traits were obtained. However, in recent years, B. kazinoki was found to be probably a naturally occurring hybrid, with B. monoica as the female parent and B. papyifera as the male parent [15]. The complicated hybridization process has blurred the genetic background of paper mulberry and hindered the determination of the classification status of the tree and its utilization.
Due to the tree unclear genome is always large, a few forest tree genomes have been sequenced. Among the genetic resources of ZJGS, only unclear genome of B. papyrifera was assembled [38]. Lacking of molecular genetic information about ZJGS may prevent optimal breeding of the tree and the exploration of adaptability mechanisms. In order to further determine the characteristics of the cp genome during the hybridization process, the cp genome of ZJGS was compared with 12 Moraceae species, including B. papyrifera and B. kazinoki. The research is not only helpful for us to further determine the status of hybrid paper mulberry, but also for us to explore the role of chloroplasts in functional selection.
The cp genome of ZJGS is a circular molecule of 160,903 bp in length (Additional file 1: Table S1). 135 functional genes were identified in the ZJGS cp genome, including 81 PCGs, 46 tRNA genes, and eight rRNA genes (Table 1). Of the 81 PCGs, 77 unique genes were generated, of which four genes, rps7, rpl23, ndhB, and ycf2, are distributed in the IRs repeat regions (IRa/IRb). 15 tRNA genes and four rRNA genes are also repeated in the IRs regions. In the ZJGS cp genome, a total of eight different genes contain introns, even the longest intron is ndhA and has a length of 1,117 bp (Additional file 2: Table S2).
Group of genes | Gene name |
---|---|
Ribosomal RNA genes | rrn16(*2), rrn23(*2), rrn4.5(*2), rrn5(*2) |
Transfer RNA genes | trnA-UGC(*4), trnC-GCA, trnD-GUC, trnE-UUC, trnF-GAA, trnfM-CAU, trnG-GCC, trnG-UCC, trnH-GUG, trnI-CAU(*2), trnI-GAU(*4), trnK-UUU, trnL-CAA(*2), trnL-UAA(*2), trnL-UAG, trnM-CAU(*2), trnN-GUU(*2), trnP-GGG, trnP-UGG, trnQ-UUG, trnR-ACG(*2), trnR-UCU, trnS-GCU, trnS-GGA, trnS-UGA, trnT-GGU(*2), trnT-UGU, trnV-GAC(*2), trnV-UAC(*2), trnW-CCA, trnY-GUA |
Small subunit of ribosomal proteins | rps11, rps12, rps14, rps15, rps16, rps18, rps19, rps2, rps3, rps4, rps7(*2), rps8 |
Large subunit of ribosomal proteins | rpl14, rpl16, rpl2a, rpl20, rpl23(*2), rpl32, rpl33, rpl36 |
RNA polymerase subunits | rpoA, rpoB, rpoC1a, rpoC2 |
Large subunit of rubisco | rbcL |
Photosystem I | psaA, psaB, psaC, psaI, psaJ |
Photosystem II | psbA, psbB, psbC, psbD, psbE, psbFa, psbH, psbI, psbJ, psbK, psbM, psbN, psbT, psbZ |
NADH dehydrogenase | ndhAa, ndhB(*2)a, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
Cytochrome b6/f complex | petA, petB, petD, petG, petL, petN |
ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI |
Conserved Open Reading Frames | ycf1a, ycf2(*2), ycf3b, ycf4 |
Maturase | matK |
Protease | clpPb |
Subunit of Acetyl-CoA-carboxylase | accD |
c-type cytochrom synthesis gene | ccsA |
heme-binding protein | cemA |
*2 means two gene copies and *4 means four gene copies; a gene containing a single intron; b gene containing two introns. |
In the Moraceae species of comparison, the cp genome length of all seven Broussonetia (ranges from 160,239 bp to 162,594 bp) is higher than that of the six Morus (ranges from 158,459 bp to 159,265 bp), but the total GC content is relatively lower in the genus Broussonetia, the average value is 35.72%, which in Morus is 36.26% (Table 2). From the perspective of cp genome length, the species that are closer to ZJGS are B. papyifera, B. kazinoki, B. monoica and B. kaempferi, and the GC content of B. kazinoki is most similar to ZJGS. The length ratio of the IRs regions in ZJGS cp genome is the lowest among all the comparative species, which is 16.04%. In addition, the ZJGS cp genome contains the largest number of genes, while the number of genes that direct synthetic proteins is the least, only 81.
Species | LSC | SSC | IR | Total | Number of genes | PCGs | |||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Length (bp) | GC% | Length (%) | Length (bp) | GC% | Length (%) | Length (bp) | GC% | Length (%) | Length (bp) | GC% | |||
ZJGS | 89,220 | 33.33 | 55.45 | 20,079 | 28.50 | 12.48 | 25,802 | 42.72 | 16.04 | 160,903 | 35.74 | 135 | 81 |
B. papyrifera | 88,621 | 33.50 | 55.31 | 19,918 | 28.49 | 12.43 | 25,850 | 42.66 | 16.13 | 160,239 | 35.83 | 132 | 88 |
B. kazinoki | 89,066 | 33.34 | 55.38 | 20,093 | 28.47 | 12.49 | 25,841 | 42.68 | 16.07 | 160,841 | 35.73 | 125 | 87 |
B. monoica | 88,993 | 33.37 | 55.35 | 20,076 | 28.51 | 12.49 | 25,854 | 42.68 | 16.08 | 160,777 | 35.76 | 130 | 85 |
B. kaempferi | 88,956 | 33.21 | 55.38 | 19,965 | 28.37 | 12.43 | 25,852 | 42.64 | 16.09 | 160,625 | 35.65 | 132 | 87 |
B. luzonica | 89,980 | 33.29 | 55.34 | 19,560 | 28.52 | 12.03 | 26,527 | 42.32 | 16.31 | 162,594 | 35.66 | 132 | 87 |
B. kurzii | 90,174 | 33.24 | 55.61 | 20,150 | 28.51 | 12.43 | 25,923 | 42.66 | 15.99 | 162,170 | 35.66 | 132 | 87 |
M. cathayana | 88,143 | 33.77 | 55.34 | 19,844 | 29.20 | 12.46 | 25,639 | 42.95 | 16.10 | 159,265 | 36.16 | 129 | 85 |
M. multicaulis | 87,940 | 33.82 | 55.27 | 19,809 | 29.26 | 12.45 | 25,677 | 42.91 | 16.14 | 159,103 | 36.19 | 129 | 85 |
M. indica | 87,386 | 34.12 | 55.14 | 19,742 | 29.35 | 12.46 | 25,678 | 42.92 | 16.20 | 158,484 | 36.37 | 130 | 84 |
M. mongolica | 87,367 | 33.97 | 55.14 | 19,736 | 29.33 | 12.45 | 25,678 | 42.92 | 16.20 | 158,459 | 36.29 | 127 | 89 |
M. notabilis | 87,470 | 34.11 | 55.12 | 19,776 | 29.34 | 12.46 | 25,717 | 42.89 | 16.21 | 158,680 | 36.36 | 129 | 83 |
M. atropurpurea | 87,761 | 33.90 | 55.16 | 19,875 | 29.30 | 12.49 | 25,707 | 42.90 | 16.16 | 159,113 | 36.20 | 126 | 89 |
In order to detect the global sequence variability of the ZJGS cp genome, the sequence of ZJGS was used as a reference to compare with the other Moraceae species (Fig. 2). It is found that the ZJGS cp genomic sequence has the highest similarity with B. kazinoki and B. monoica by comparison. Furthermore, the sequence of its paternal B. papyrifera exhibits higher divergence. These highly different regions include matK, rps16, atpF, rpoC2, rpoC1, ycf3, clpP, ndhF, ccsA, ndhA and ycf1. It has become apparent to us that these genes are good sources for interspecies phylogenetic analysis and evolutionary studies. There are also some genes that are relatively conserved among these species, such as atpH, petN, psbE, rpl2, psbH, psbM and rps7. As a whole, the non-coding sequence of the Moraceae plants is more divergent than the coding sequence, and the sequence variation in the IRs regions is smaller than that in the LSC and SSC regions, which also shows that the IRs regions are more conservative than the LSC and SSC regions.
The BRIG image shows that the cp genome sequence of the Moraceae species is conserved (Fig. 3). We use the ZJGS cp genome as the reference sequence, which is distributed in the innermost circle of the concentric circle, and the outermost circle is the PCGs of ZJGS. Here, it can be clearly seen that the cp genome of the six innermost Broussonetia genus species is more closed than the other Morus species in the outer circle compared to the reference cp genome sequence, indicating that the cp genome sequence of Broussonetia species have high similarity with ZJGS. In addition, from the figure we can also intuitively see that the most conservative sequence is located in the IRs regions, and the GC content is also the highest among all species.
Through the previous study of sequence diversity, we know that the IRs regions are the most conserved regions in the cp genome. Most studies have shown that the contraction and expansion of the IRs regions determine the length of the cp genome, and this change directly affects the length and distribution of genes at the boundary of this region [23, 39–41]. Intuitive results show that the gene composition in IR/SC junction of ZJGS cp genome is somewhat different from that of several other species (Fig. 4). Among them, the most obvious is that on the left of the IRb/LSC region is the gene rpl23, while the remaining species are rpl2. Same as B. monoica, B. kaempferi, B. luzonica and B. kurzii, in IRa /SSC region, ycf1 is replaced by trnN in ZJGS cp genome. However, the genetic constitution in the two parent species of ZJGS at the IR/SC junction is similar. In terms of gene length,the rpl2 gene (401 bp) in the ZJGS cp genome is significantly shortened in the IRa region and is more than three times shorter than other species (1,509 bp), which is related to the IRa boundary contraction.
In this study, the SSRs of ZJGS and 12 Moraceae species cp genome were detected (Fig. 5). Similarly, all cp genomes of Moraceae plants have mononucleotides, dinucleotides, trinucleotides, tetranucleotides and pentanucleotides. While, hexanucleotide repeats are only detected in B. papyifera and B. kurzii. Moreover, the number of SSRs in Broussonetia species is higher than that of Morus species, except for B. papyifera. On the whole, ZJGS and its maternal species B. kazinoki have the most similar performance among different types of SSRs (Table 3). Mainly reflected in the following SSRs: C/G, AAT/ATT, AAG/CTT, AAAT/ATTT, AGAT/ATCT, ATCG/ATCG, AAATT/AATTT, AATAG/ATTCT, AATAT/ATATT, and AAATGT/ACATTT, which shows the characteristics of maternal inheritance of ZJGS cpDNA. In order to obtain more genetic regulation information, the long repeat sequences of 13 Moraceae plants were further analyzed (Fig. 6A). Four types of repeats were detected in the cp genome of M. atropurpurea and Broussonetia species except for B. luzonica. And the number of palindromic repeat is the largest, then the type of forward repeat and reverse repeat, the least is the complementary repeat. The forward repeats, palindromic repeats and complementary repeats in the ZJGS cp genome all contain the most repeting elements. Repetitive sequences of 20–30 bp in length reach 50% or more (Fig. 6B-E). It can be clearly seen that the long repeat sequences of the ZJGS cp genome is dominant in this interval, and these repeat elements can be used as important genetic resources for the genetics and phylogenetic of the ZJGS population.
SSR | Species | Number of repeats | |||||||||||||||
3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | total | ||
A/T | ZJGS | - | - | - | - | - | - | - | 15 | 18 | 20 | 5 | 3 | 6 | 1 | - | 68 |
B. papyrifera | - | - | - | - | - | - | - | 27 | 11 | 2 | 6 | 2 | 4 | 1 | 1 | 54 | |
B. kazinoki | - | - | - | - | - | - | - | 18 | 17 | 19 | 3 | 7 | 3 | - | - | 67 | |
C/G | ZJGS | - | - | - | - | - | - | - | 1 | - | - | - | - | - | - | - | 1 |
B. papyrifera | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
B. kazinoki | - | - | - | - | - | - | - | 1 | - | - | - | - | - | - | - | 1 | |
AG/CT | ZJGS | - | 20 | 1 | - | - | - | - | - | - | - | - | - | - | - | - | 21 |
B. papyrifera | - | 20 | 1 | - | - | - | - | - | - | - | - | - | - | - | - | 21 | |
B. kazinoki | - | 20 | 1 | - | - | - | - | - | - | - | - | - | - | - | - | 21 | |
AT/AT | ZJGS | - | 47 | 7 | 2 | 2 | - | - | 1 | - | - | - | - | - | - | - | 59 |
B. papyrifera | - | 38 | 12 | 2 | - | 1 | - | - | - | - | - | - | - | - | - | 53 | |
B. kazinoki | - | 20 | 1 | - | - | - | - | - | - | - | - | - | - | - | - | 21 | |
AAT/ATT | ZJGS | - | 4 | - | - | - | - | - | - | - | - | - | - | - | - | - | 4 |
B. papyrifera | - | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | 2 | |
B. kazinoki | - | 4 | - | - | - | - | - | - | - | - | - | - | - | - | - | 4 | |
AAG/CTT | ZJGS | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 |
B. papyrifera | - | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
AAAT/ATTT | ZJGS | 4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4 |
B. papyrifera | 4 | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | 6 | |
B. kazinoki | 4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 4 | |
AATT/AATT | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
B. kazinoki | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
AGAT/ATCT | ZJGS | - | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | - | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
ATCG/ATCG | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
B. kazinoki | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
AAAGG/CCTTT | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
AAATT/AATTT | ZJGS | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2 |
B. papyrifera | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 2 | |
AATAG/ATTCT | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | - | 2 | - | - | - | - | - | - | - | - | - | - | - | - | - | 2 | |
B. kazinoki | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
AATAT/ATATT | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | |
B. kazinoki | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
AATCT/AGATT | ZJGS | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 |
B. papyrifera | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
AAATGT/ACATTT | ZJGS | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 |
B. papyrifera | 1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 1 | |
B. kazinoki | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 |
Ka/Ks ratio represent the ratio between non-synonymous substitution (Ka) and synonymous substitution (Ks) at a particular site, which is used to infer the direction and magnitude of natural selection acting on PCGs. A ratio > 1 means positive selection; < 1 means purifying selection; and = 1 indicates no selection [42, 43]. According to previous study, except for the most genes with faster evolution rate, the frequency of synonymous nucleotide substitutions is higher than non-synonymous substitutions due to the purification selection [40]. In this study, the cp genome sequence of ZJGS was used as reference, the Ka/Ks ratio of Moraceae species was presented (Fig. 7, Additional file 3:Table S3). The Ka/Ks ratio of such genes: atpH, pabC, petG, petN, psaC, psbE-psbI, psbL-psbT, rps7 and rps16 is 0, which reflects the strong purification options and also indicates that these genes are highly conserved. On the contrary, most of the NADH genes (ndhA-ndhK) have higher Ka/Ks ratio, and in the comparison of all species, the Ka/Ks ratio of atpF, ndhB and rpoC1 is > 1, these three genes are strongly positively selected. Except for the above three genes, clpP, ndhA, ndhD, petB and petD genes are also positively selected compared with maternal B. kazinoki of ZJGS. While compared with paternal B. papyrifera, ndhA, ndhD, ndhF, petB, rpl16, rpoA and ycf1 genes were positively selected. Among them, petD and rpl16 have the largest Ka/Ks ratio, and both are > 3. These two genes are highly evolved and can be used for subsequent gene identification.
In the phylogenetic tree based on 77 shared PCGs, ZJGS and its maternal B. kazinoki formed a single clade with high bootstrap support (99%) through three different methods (Fig. 8). All species are divided into four branches according to their evolutionary history. The first branch is the family of Moraceae, which was divided into three genera, namely, Broussonetia, Ficus and Morus. Here, Broussonetia and Ficus species are highly clustered with bootstrap values ≥ 99%. In addition, ZJGS, B. kazinoki and B. monoica formed a well-supported clade, as a natural hybrid, B. kazinoki also has the closest phylogenetic relationship with its maternal B. monoica. The second branch is Cannabaceae includes C. sativa and H. lupulus, which is closely to Moraceae. Studies have shown that both Cannabaceae and Moraceae belong to the Urticalean Rosid clade [44, 45]. The remaining two families, Rosaceae and Fagaceae, belong to Rosanae. In general, the classification results provide important information for the phylogenetic of ZJGS, as well as confirm the source of ZJGS hybridization.
In this study, we determined and analyzed the cp genome of ZJGS (a hybrid plant) and compared it with the Moraceae species for structural comparison and sequence alignment. From the structure composition of the cp genome, it can be seen that the seven Broussonetia species and the six Morus species are grouped independently and each of them have similar compositions. Among the seven Broussonetia species, total GC content of ZJGS cp genome sequence is the closest to the maternal B. kazinoki, shows high species affinity [41], which is also reflected in the comparison of sequence diversity and SSRs. In addition, the sequence comparison analysis of ZJGS cp genome and other 12 Moraceae plants at the genomic level reflects the high sequence conservation of the IRs regions, so the IRs regions changes can be used as a marker of species evolution [46, 47]. At the boundary of the IRs regions, the changes in the length of the rpl2, rps19, trnN and ycf1 reflect the contraction and expansion of the IRs regions. In the evolution of higher plants, all kinds of repetitive elements have a common feature, which tends to show “co-evolution”, which plays a crucial role in the genome sequence diversity and gene rearrangement [48, 49]. Studies have shown that the number of repeats is related to the degree of genome rearrangement [50–52]. Here, long repeat sequences in ZJGS cp genome are the largest, and the number of 20–30 bp repeats may promote the rearrangement of ZJGS [53], these repeats can serve as good molecular markers for species evolution, and play an important role in the variation of cp genome sequence.
The breeding process of ZJGS is long and complicated. It has undergone many cross-breeding and space mutation to form excellent plants with stable inheritance. And its maternal B. kazinoki has experienced controversy [15, 54], therefore, it is necessary to compare ZJGS and its hybrid parent systematically. Unlike the previous study [36], the cp genome of ZJGS was analyzed in detail. Through comparison, it is found that the total length and total GC content of ZJGS cp genome sequence (160,903 bp and 35.74%, respectively) are closest to its maternal B. kazinoki (160,841 bp and 35.73%, respectively), while its paternal B. papyrifera is 160,239 bp and 35.83%, respectively. This reflects a high degree of species affinity in the history of system evolution [41, 55]. Similarly, in the sequence diversity comparison, we also found that the cp genome sequence of ZJGS and maternal B. kazinoki remains highly consistence. This is related to sequence conservation and provides a basis for the species evolution of ZJGS [40, 56]. The construction of the phylogenetic tree shows us the most intuitive results, closely related species are grouped into the same branch. And ZJGS is closest to maternal B. kazinoki, which strongly supports the characteristics of cp maternal inheritance [57, 58].
As we have known before, when Ka/Ks > 1, the gene is strongly positive selection [42, 43]. We can screen some genes according to Ka/Ks ratio and then carry out functional studies, which have been commonly applied to the field of molecular evolution [59, 60]. In our research, three genes of atpF, ndhB and rpoC1 show a complete positive selection. In previous studies, some genes have been reported with a faster evolution rate, including ycf1, ycf2, accD, clpP, ndhA, rbcL, matK, ccsA, and cemA [22, 61]. Compared with two cross parents, the same genes (atpF, ndhA, ndhB, ndhD, petB and rpoC1) with positive selection were screened out. Most of the NADH genes (ndhA-ndhK) have a higher Ka/Ks ratio than photosynthesis genes (Photosystem I: psaA, psaB, psaC, psaI, psaJ; Photosystem II: psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbM, psbN, psbT, psbZ), which is similar to the previous research [55, 62], shows that photosynthesis genes have strong purification options. NADH genes have higher activity during cellular senescence and oxidative stress [63, 64], and early hybrid selection experiments showed that most proteins synthesized by chloroplasts in the early stages of aging are NDH polypeptides [65]. It can be seen that these genes under positive selection belong to the advantageous genes of hybrid plants, and may be the key to speciation.
This study is based on the cp genome of ZJGS, a hybrid species, whose structure and composition are the same as most angiosperms, and the IRs regions are highly conserved. The detailed comparison of 13 closely related Moraceae plants revealed the evolutionary characteristics of ZJGS cp genome: (1) The comparison of total GC content, sequence diversity, and SSRs all indicate that the cp genome of ZJGS has significant maternal inheritance; (2) The high sequence variability of the LSC region, the SSC region, and the intergenic regions, and the contraction and expansion of the IRs regions bounder leads to the changes in gene length; (3) Compared with Morus species, the Broussonetia species have a longer cp genome length, higher total GC content and SSRs number. And among all the Broussonetia plants, ZJGS is closest to maternal B. kazinoki. (4) Ka/Ks ratio reveals positive selection of genes such as atpF, ndhB and rpoC1, and NADH genes. Phylogenetic analysis supporting the close relationship between ZJGS and its hybrid parents (the maternal B. kazinoki and the paternal B. papyrifera). And it strongly supports the genetic relationship between ZJGS and maternal B. kazinoki. Our results provide a basis for overcoming phylogenetic problems at the species level, and also provide important genomic resources for the functional utilization of ZJGS.
Our research group first reported the complete cp genome of ZJGS and the sequence was submitted to Genebank with the accession number of MF496038. The more experiment details can be got from the announcement [36]. At the same time, 12 Moraceae plants, including six Broussonetia species and six Morus species were selected to compare cp genomes structure with ZJGS. The cp genome information of 13 Moraceae plants was downloaded from NCBI database, and the GenBank accession numbers were listed in Additional file 4: Table S4. Firstly, the characteristics of cp genomes, such as GC content, length of IRs regions and number of genes were compared manually. Secondly, mVista in Shuffle-LAGAN mode and Blast Ring Image Generator (BRIG) were employed to compare cp genomes of the above selected 13 plants and ZJGS was set as the reference [66, 67]. To explore the evolutionary event of ZJGS, the details of IRs junction regions were detected and displayed, for the IRs regions were considered as the most conserved regions and these junctions were regarded as an index of cp genome evolution.
SSRs, which is one of the most widely used molecular marker systems in plant genetic breeding and revealed more polymorphism [68–70]. The SSRs within the cp genome of each species were predicted by MISA with the parameters were set as: ≥ 10 for mono-nucleotides, ≥ 4 for di-nucleotides, ≥ 3 for tetra-nucleotides, penta-nucleotide and hexa-nucleotide. The number and details of SSRs were showed. Long repeat sequences, including forward match, reverse match, palindromic match and complementary match, within the cp genome, were identified by REPuter [71]. The related settings were showed as follows: (1) 90% or greater sequence identity; (2) a minimal repeat size of 20 bp. The graphs were all plotted using software Simplot 12.5.
The Ka/Ks ratio is a good indicator of selective pressure at the sequence level, which can calculate selective pressure within protein coding regions. Here, the Ka/Ks ratio of 77 shared protein-coding genes (PCGs) of 12 Moraceae plants cp genomes were compared with ZJGS. Firstly, the multiple nucleotide sequences of the homologous genes that code for proteins were aligned by MEGA 7 [72], and then the Ka/Ks ratio was calculated using DnaSP v5 [73].
In order to clarify the phylogenetic relationship of ZJGS, 27 related species were downloaded from the NCBI for phylogenetic analysis (the GenBank accession numbers were listed in Additional file 4: Table S4). The software MEGA 7 was used for multiple sequences alignment and evolutionary analysis [72]. Three methods were used to infer the evolutionary history and construct phylogenetic tree, including neighbor-joining method (NJ), maximum likelihood method (ML) and maximum parsimony method (MP). The evolutionary distances of NJ method were calculated using the Poisson correction method [74] and were in the units of the quantity of amino acid substitutions per site. In the ML method, initial tree was obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using JTT model, and then selecting the topology with superior log likelihood value. The MP tree was obtained using the Subtree-Pruning-Regrafting (SPR) algorithm with search level 1 in which the initial trees were obtained by the random addition of sequences (10 replicates). The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) are shown next to the branches.
Broussonetia papyifera; bp:Base pairs; BRIG:Blast Ring Image Generator; cp:Chloroplast; CNR1:Cell Number Regulator 1 gene; DOGMA:Dual Organellar GenoMe Annotator; F. carica:Ficus carica; IRs:inverted repeats; Ka:the number of nonsynonymous substitutions per non-synonymous site; Ks:the number of synonymous substitutions per synonymous site; LSC:large single copy; ML:maximum likelihood method; MP:maximum parsimony method; M. atropurpurea:Morus atropurpurea; NJ:neighbor-joining method; NAD(P)H:Nicotinamide adenine dinucleotide (reduced); NDH:NAD(P)H dehydrogenase; PCGs:protein-coding genes; SSC:small single copy; rRNA:Ribosomal RNA; SSR:Simple sequence repeats; SPR:Subtree-Pruning-Regrafting ; tRNA:Transfer Ribonucleic Acid; ZJGS:Broussonetia kazinoki × Broussonetia papyifera.
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable.
Availability of data and materials
All datasets supporting the conclusions of this article are included within the article (and its Additional files). All sequences can be downloaded from NCBI Genebank (https://www.ncbi.nlm.nih.gov/) with the accession number which has been mentioned in the article.
Competing interests
All the authors declare that they have no conflict of interest.
Funding
This research was funded by National Natural Science Foundation of China (31700332), Hunan Provincial Natural Science Foundation of China (2019JJ50027), Forestry science and technology project of Hunan Province (XKL201731), Science and Technology Project of Hunan Province (No: 2016TP2007, 2016TP1014, 2017TP2006), Postdoctoral Science Foundation project of Shaanxi Province (2016BSHEDZZ117) and Major Science and Technology Program of Hunan province (2017NK1014).
Authors' Contributions
YG and ZY are the lead investigator and conceived of the study. XZ and ZW wrote the paper and analyzed all the data. HH and ZJ is the lead investigator for the genome sequence. XZ and YG designed and fund for the current study. All authors have read and commented on the article.
Acknowledgements
Not applicable