Genomic features of the P. granatum mitochondrial genome
The P. granatum mitochondrial genome was sequenced, assembled, and annotated, and is now archived in GenBank (accession numbers OQ973289, OQ973290, OQ973291, OQ973292, OQ973293, OQ973294 and OQ973295). It was assembled into seven circular chromosomes ranging in length from 3,728 bp to 126,124 bp, with a total length of 382,774 bp (Fig. 1 and Fig. S1). The overall GC content was 45.91%, ranging from 45.42% to 52.60% among chromosomes (Table S1). The genome encoded 74 genes, including 46 mitochondrial protein-coding genes, 25 tRNA genes, and three rRNA genes (rrn5, rrn18, and rrn26). The GC contents of the CDS, tRNA and rRNA genes were 42.35%, 50.85% and 52.14%, respectively (Table S1).
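As a minimal sketch of how GC percentages such as those above can be computed, the Python functions below count G and C bases in one sequence and then pool several chromosomes so that the overall value is weighted by length. The function names and toy sequences are illustrative assumptions, not the pipeline used in this study.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    # Count only unambiguous bases so that N and other codes are ignored.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def pooled_gc_content(chromosomes: list[str]) -> float:
    """GC content over all chromosomes taken together (length-weighted)."""
    return gc_content("".join(chromosomes))


# Toy sequences (hypothetical, for illustration only).
toy_chromosomes = ["ATGCGC", "ATAT"]
per_chromosome = [gc_content(c) for c in toy_chromosomes]
overall = pooled_gc_content(toy_chromosomes)
```

Pooling the sequences, rather than averaging the per-chromosome percentages, keeps short chromosomes from having a disproportionate effect on the overall value.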
The protein-coding genes comprised 24 core genes and 12 variable genes (Table 1). The core genes included five ATP synthase genes (atp1, atp4, atp6, atp8 and atp9), nine NADH dehydrogenase genes (nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7 and nad9), four cytochrome c biogenesis genes (ccmB, ccmC, ccmFc and ccmFn), three cytochrome c oxidase genes (cox1, cox2 and cox3), one maturase gene (matR), one membrane transport protein gene (mttB) and one ubiquinol cytochrome c reductase gene (cob). The variable genes included four genes for large-subunit ribosomal proteins (rpl10, rpl16, rpl2 and rpl5), seven genes for small-subunit ribosomal proteins (rps1, rps3, rps4, rps7, rps12, rps14 and rps19), and one succinate dehydrogenase gene (sdh4). Among these, atp9, mttB, nad2 and nad5 were present in two copies. The nad1, nad2, nad5 and nad7 genes each contained four introns and nad4 contained three, while ccmFc, cox2, rpl2 and rps3 each had one intron. The trnC-GCA, trnM-CAT, trnP-TGG and trnS-TGA genes were multi-copy genes. The trnA-TGC, trnI-GAT and trnI-TAT genes each contained one intron.
Table 1
Gene composition in the mitogenome of P. granatum
Group of genes | Gene name | Length (bp) | Start codon | Stop codon | Amino acids (aa) |
ATP synthase | atp1 | 1536 | ATG | TAA | 512 |
| atp4 | 597 | ATG | TAG | 199 |
| atp6 | 747 | ATG | TGA | 249 |
| atp8 | 480 | ATG | TAA | 160 |
| atp9(2) | 225 | ATG | CGA(TGA) | 75 |
Cytochrome c biogenesis | ccmB | 621 | ATG | TGA | 207 |
| ccmC | 753 | ATG | TGA | 251 |
| ccmFc | 1356 | ATG | TAA | 452 |
| ccmFn* | 1737 | ATG | TAA | 579 |
Ubiquinol cytochrome c reductase | cob | 1890 | ATG | TAA | 630 |
Cytochrome c oxidase | cox1 | 1065 | ATG | TAA | 355 |
| cox2* | 783 | ATG | TAA | 261 |
| cox3 | 798 | ATG | TGA | 266 |
Maturases | matR | 1926 | ATG | TAA | 642 |
Transport membrane protein | mttB(2) | 342 | ATG | TGA | 114 |
NADH dehydrogenase | nad1**** | 978 | ACG(ATG) | TAA | 326 |
| nad2(2)**** | 1467 | ATG | TAA | 489 |
| nad3 | 357 | ATG | TAA | 119 |
| nad4*** | 1488 | ATG | TGA | 496 |
| nad4L | 303 | ATG | TAA | 101 |
| nad5(2)**** | 2013 | ATG | TAA | 671 |
| nad6 | 618 | ATG | TAA | 206 |
| nad7**** | 1185 | ATG | TAG | 395 |
| nad9 | 573 | ATG | TAA | 191 |
Ribosomal proteins (LSU) | rpl10 | 489 | ATG | TAA | 163 |
| rpl16 | 249 | ATG | TAA | 83 |
| rpl2 | 1014 | ATG | TAA | 338 |
| rpl5 | 564 | ATG | TAA | 188 |
Ribosomal proteins (SSU) | rps1 | 624 | ATG | TAA | 208 |
| rps12* | 378 | ATG | TGA | 126 |
| rps14 | 303 | ATG | TAG | 101 |
| rps19 | 285 | ATG | TAA | 95 |
| rps3 | 1710 | ATG | TAA | 570 |
| rps4 | 1047 | ATG | TAA | 349 |
| rps7 | 447 | ATG | TAA | 149 |
Succinate dehydrogenase | sdh4 | 387 | ATG | CGA(TGA) | 129 |
Ribosomal RNAs | rrn18 | 1929 | - | - | - |
| rrn26 | 3403 | - | - | - |
| rrn5 | 121 | - | - | - |
Transfer RNAs | trnA-TGC* | 65 | - | - | - |
| trnC-GCA(2) | 71/73 | - | - | - |
| trnD-GTC | 74 | - | - | - |
| trnE-TTC | 72 | - | - | - |
| trnF-GAA | 74 | - | - | - |
| trnG-GCC | 72 | - | - | - |
| trnH-GTG | 74 | - | - | - |
| trnI-GAT* | 72 | - | - | - |
| trnI-TAT* | 76 | - | - | - |
| trnK-TTT | 73 | - | - | - |
| trnM-CAT(3) | 74/73/74 | - | - | - |
| trnN-GTT | 72 | - | - | - |
| trnP-TGG(3) | 74/75/75 | - | - | - |
| trnQ-TTG | 72 | - | - | - |
| trnS-GCT | 88 | - | - | - |
| trnS-TGA(2) | 93/87 | - | - | - |
| trnV-GAC | 72 | - | - | - |
| trnW-CCA | 74 | - | - | - |
| trnY-GTA | 83 | - | - | - |
Note: Numbers in parentheses after gene names indicate the number of copies. The number of asterisks (*) indicates the number of introns in the gene.
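In Table 1, the amino acid count of each protein-coding gene equals its length divided by three, so the count includes the position of the stop codon. The short Python check below illustrates this relationship on four rows copied from the table; the helper name `codon_count` is hypothetical.

```python
# (gene, length in bp, amino acid count) copied from Table 1.
TABLE1_ROWS = [
    ("atp1", 1536, 512),
    ("cob", 1890, 630),
    ("nad5", 2013, 671),
    ("sdh4", 387, 129),
]


def codon_count(length_bp: int) -> int:
    """Number of codons in a coding sequence, stop codon included."""
    if length_bp % 3 != 0:
        raise ValueError(f"length {length_bp} is not a multiple of three")
    return length_bp // 3


# True when every listed row satisfies length / 3 == amino acid count.
consistent = all(codon_count(length) == aa for _, length, aa in TABLE1_ROWS)
```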
Codon usage and RSCU analysis
ATG was the most frequent start codon for the protein-coding genes; the exception was nad1, whose initiation codon was ACG (ATG). The stop codons TAA, TAG, TGA, and CGA (TGA) were identified (Table 1). These results indicate that C-to-U RNA editing occurs in start and stop codons, converting ACG to the start codon ATG and CGA to the stop codon TGA.
The relative synonymous codon usage (RSCU) of the P. granatum mitochondrial genome is shown in Fig. 2. The protein-coding regions contained 10,445 codons, excluding termination codons. The most frequently used codons were UUU (Phe), AUU (Ile) and GAA (Glu), each used more than 300 times, while CUG (Met), UUG (Met) and UAG (Ter) were rarely found (Table S2). Most codons ending in A or U had RSCU values higher than 1.0, while most of those ending in C or G had RSCU values lower than 1.0. Codon usage was therefore strongly biased toward A or T (U) at the third codon position in the P. granatum mitogenome. Similar results have been reported for the mitogenomes of other plants [17, 24].
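For readers unfamiliar with the measure, RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so a value above 1.0 marks a preferred codon. The Python sketch below is one way to compute it. It assumes the standard genetic code and skips stop codons; the function name and toy input are illustrative and are not taken from the software used in this study.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds_list):
    """Return {codon: RSCU} for the sense codons of the given coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            # Skip stop codons and codons containing ambiguous bases.
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1
    values = {}
    for aa in set(AMINO_ACIDS) - {"*"}:
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the input
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy coding sequence: TTT, TTT, TTC (all Phe).
toy_values = rscu(["TTTTTTTTC"])
```

In the toy input, TTT is used twice and TTC once, so TTT has an RSCU above 1.0 and TTC below 1.0, the same pattern reported here for codons ending in A or U versus C or G.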
Repeat sequence analysis
A total of 188 pairs of repeats with a length of at least 30 bp were found, including 97 pairs of reverse complementary repeats and 91 pairs of forward repeats. The longest reverse complementary repeat was 355 bp, while the longest forward repeat was 24,984 bp (Table S3).
Simple sequence repeats (SSRs) with motifs of one to six bases are abundant in higher plants. In this study, a total of 141 SSRs were detected in the P. granatum mitogenome, including 58 (41.13%) mono-, 19 (13.48%) di-, 18 (12.77%) tri-, 35 (24.82%) tetra-, 9 (6.38%) penta-, and 2 (1.42%) hexanucleotide repeats (Fig. 3A). Monomeric and dimeric SSRs together accounted for 54.61% of the total. Mononucleotide repeats of A/T (34.75%) were the most prevalent repeat type, and dinucleotide repeats of AG/CT (7.80%) were the second most numerous.
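SSR detection of this kind can be sketched with regular expressions that look for a motif of one to six bases repeated in tandem. In the Python example below, the minimum repeat numbers are assumptions chosen for illustration, since the thresholds used in this study are not stated here, and the function names are hypothetical.

```python
import re

# Minimum number of tandem repeats per motif length (assumed thresholds).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Drop motifs such as "AA" or "AGAG", already covered by shorter units.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence with one mononucleotide and one dinucleotide repeat.
toy_seq = "GG" + "A" * 12 + "CC" + "AG" * 6 + "TT"
toy_hits = find_ssrs(toy_seq)
```

This simplified search reports only perfect repeats on the given strand and does not merge compound SSRs, which dedicated SSR tools typically handle.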
Prediction of RNA editing sites
RNA editing events were predicted for the 36 unique PCGs using an online prediction tool. A total of 466 potential RNA editing sites were distributed among the PCGs, all of which were C-to-U edits (Fig. 3B and Table S4). The nad4 gene was the most heavily edited, with 39 potential RNA editing sites, followed by ccmB with 36. The rps1 gene had the fewest, with only one potential RNA editing site.
Edits that changed a hydrophilic amino acid to a hydrophobic one were the most common type, with 224 sites (48.07%) (Table 2). The hydrophobic-hydrophobic and hydrophilic-hydrophilic types accounted for 31.33% (146 sites) and 12.23% (57 sites), respectively, and the hydrophobic-hydrophilic type for 7.30% (34 sites). The hydrophilic-stop type was the least common, at 1.07% (5 sites). Among the edited sites, there were 10 conversions of CCT to TTT and 8 conversions of CCC to TTC.
Table 2
Prediction of RNA editing sites in P. granatum mitogenome
Type | RNA-editing | Number | Percentage |
hydrophilic-hydrophilic | CAC (H) => TAC (Y) | 6 | |
| CAT (H) => TAT (Y) | 16 | |
| CGC (R) => TGC (C) | 6 | |
| CGT (R) => TGT (C) | 29 | |
| total | 57 | 12.23% |
hydrophilic-hydrophobic | ACA (T) => ATA (I) | 5 | |
| ACC (T) => ATC (I) | 1 | |
| ACG (T) => ATG (M) | 5 | |
| ACT (T) => ATT (I) | 4 | |
| CGG (R) => TGG (W) | 32 | |
| TCA (S) => TTA (L) | 71 | |
| TCC (S) => TTC (F) | 30 | |
| TCG (S) => TTG (L) | 38 | |
| TCT (S) => TTT (F) | 38 | |
| total | 224 | 48.07% |
hydrophilic-stop | CAA (Q) => TAA (X) | 3 | |
| CGA (R) => TGA (X) | 2 | |
| total | 5 | 1.07% |
hydrophobic-hydrophilic | CCA (P) => TCA (S) | 6 | |
| CCC (P) => TCC (S) | 8 | |
| CCG (P) => TCG (S) | 3 | |
| CCT (P) => TCT (S) | 17 | |
| total | 34 | 7.30% |
hydrophobic-hydrophobic | CCA (P) => CTA (L) | 36 | |
| CCC (P) => CTC (L) | 12 | |
| CCC (P) => TTC (F) | 8 | |
| CCG (P) => CTG (L) | 29 | |
| CCT (P) => CTT (L) | 23 | |
| CCT (P) => TTT (F) | 10 | |
| CTC (L) => TTC (F) | 8 | |
| CTT (L) => TTT (F) | 12 | |
| GCA (A) => GTA (V) | 1 | |
| GCG (A) => GTG (V) | 5 | |
| GCT (A) => GTT (V) | 2 | |
| total | 146 | 31.33% |
| All | 466 | 100% |
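The grouping in Table 2 can be reproduced with a small Python sketch that translates the codon before and after editing and labels each residue as hydrophilic, hydrophobic, or stop. The hydrophilic and hydrophobic sets below are inferred from the rows of Table 2 and cover only the residues that appear there. The standard genetic code is assumed and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Residue classes as implied by the rows of Table 2.
HYDROPHILIC = set("HYRCTSQ")
HYDROPHOBIC = set("IMWLFPAV")


def residue_class(aa: str) -> str:
    if aa == "*":
        return "stop"
    if aa in HYDROPHILIC:
        return "hydrophilic"
    if aa in HYDROPHOBIC:
        return "hydrophobic"
    raise ValueError(f"residue {aa!r} does not appear in Table 2")


def classify_edit(codon_before: str, codon_after: str) -> str:
    """Label an edit, for example 'hydrophilic-hydrophobic'."""
    before = residue_class(CODON_TABLE[codon_before])
    after = residue_class(CODON_TABLE[codon_after])
    return f"{before}-{after}"


# Type totals copied from Table 2, used to recompute the percentages.
TYPE_TOTALS = {
    "hydrophilic-hydrophilic": 57,
    "hydrophilic-hydrophobic": 224,
    "hydrophilic-stop": 5,
    "hydrophobic-hydrophilic": 34,
    "hydrophobic-hydrophobic": 146,
}
ALL_SITES = sum(TYPE_TOTALS.values())
PERCENTAGES = {t: round(100 * n / ALL_SITES, 2) for t, n in TYPE_TOTALS.items()}
```

Recomputing the percentages from the type totals gives the same values as the Percentage column, and the totals sum to the 466 predicted sites.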
Nucleotide diversity and comparative analysis of mitochondrial structure
The nucleotide diversity (pi) values of 38 regions were calculated and ranged from 0 to 0.10582, with an average of 0.03132 (Fig. 4A and Table S5). The gene10.nad9 region had the highest pi value, followed by the gene37.atp9 and gene33.rrn18 regions (0.07259 and 0.0698, respectively). The generally low pi values suggested that the mitogenome sequences of P. granatum were highly conserved.
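Nucleotide diversity can be read as the average number of differences per site between all pairs of aligned sequences. The Python sketch below implements this definition directly for a pre-aligned region. It is a simplified illustration with a hypothetical function name and toy data, and it treats every mismatching column, including gaps, as a difference, which dedicated population-genetics software may handle differently.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site for equal-length aligned sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatching columns for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: one variable site among four columns.
toy_pi = nucleotide_diversity(["ACGT", "ACGA", "ACGT"])
```

A region in which all sequences are identical gives a value of 0, the lower bound of the range reported here.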
The mitogenome structures of P. granatum and closely related species in the order Myrtales were compared using CGView software (Fig. 4B). The mitochondrial structure of P. granatum was highly similar to those of Eucalyptus grandis (NCBI accession number: MG925370.1), Lagerstroemia indica (NCBI accession number: KX641464.1) and Medinilla magnifica (NCBI accession number: MT043351.1).
Phylogenetic and collinearity analysis
Mitochondrial genomes of species closely related to pomegranate have rarely been reported. To further explore the evolutionary relationships of P. granatum, a phylogenetic tree was constructed using the maximum likelihood method (Fig. 5). Phylogenetic analysis was performed on 30 conserved mitochondrial PCGs (atp1, atp4, atp6, atp8, atp9, ccmC, ccmFc, ccmFn, cob, cox1, cox2, cox3, matR, nad1, nad2, nad3, nad4, nad4L, nad5, nad6, nad7, nad9, rpl5, rps1, rps3, rps4, rps12, rps14 and sdh4) from 11 species. The mitochondrial genomes of Malus domestica, Vitis vinifera and Fragaria orientalis were set as outgroups. The phylogenetic tree strongly supports (100% bootstrap support) a close phylogenetic relationship between P. granatum and Lagerstroemia indica.
Collinearity analysis of Lagerstroemia indica, Eucalyptus grandis, Medinilla magnifica and P. granatum was performed with two methods (Fig. 6). The mitogenomes of P. granatum and Lagerstroemia indica shared the most collinear forward and reverse complementary alignments, followed by those of P. granatum and Eucalyptus grandis (Fig. 6A). Many homologous collinear blocks were found between the P. granatum mitochondrial genome and those of the three other closely related species (Fig. 6B). Gaps indicate sequences that are unique to one species and have no homology with the other species. The results suggest that the P. granatum mitochondrial genome has undergone extensive genomic rearrangement relative to closely related species.
Intracellular gene transfer of P. granatum organelle genomes
The P. granatum mitogenome was approximately 2.41 times as long as its chloroplast (cp) genome (NCBI accession number: MK603511, 158,638 bp). According to sequence similarity analysis, there were 22 homologous fragments between the mitogenome and the chloroplast genome, with a total length of 19,889 bp, accounting for 5.20% of the total mitogenome (Fig. 7 and Table S6). Annotation of these 22 homologous fragments identified 11 complete genes, including nine tRNA genes (trnD-GUC, trnH-GUG, trnI-GAU, trnM-CAU, trnP-UGG, trnS-UGA, trnV-GAC, trnW-CCA, trnN-GUU) and two rRNA genes (rrn5 and rrn16). Sequences from 35 chloroplast genes were found to have been transferred to the mitochondrial genome of P. granatum (Table S7). Of these, the atpA_len507 sequence was the longest (471 bp) and the ndhA_len363 sequence was the shortest (35 bp).
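The length ratio and the transferred fraction reported above follow from simple arithmetic, sketched here in Python. The chloroplast genome length of 158,638 bp is an assumption: it is the value consistent with the reported 2.41-fold ratio. The other figures are copied from the text.

```python
MITO_LENGTH_BP = 382_774   # total mitogenome length
CP_LENGTH_BP = 158_638     # chloroplast genome length (assumed, see lead-in)
HOMOLOGOUS_BP = 19_889     # total length of the 22 homologous fragments

# How many times longer the mitogenome is than the chloroplast genome.
length_ratio = MITO_LENGTH_BP / CP_LENGTH_BP

# Share of the mitogenome made up of chloroplast-homologous fragments.
homologous_percent = 100 * HOMOLOGOUS_BP / MITO_LENGTH_BP
```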