SSR and Long-Repeat Analysis
Simple sequence repeats (SSRs), also known as microsatellites, are tandem repeats with unit lengths of 1–6 nucleotides. They are widely distributed in cp genomes and are often used as genetic markers in population genetic and evolutionary studies because of their high intraspecific variability (Ellis and Burke 2007). In this study, we analyzed the SSRs in the cp genomes of M. simplicifolia and six other Meconopsis species. A total of 33 SSRs were identified in the M. simplicifolia cp genome, while M. horridula, M. integrifolia, M. punicea, M. racemosa, M. henrici, and M. quintuplinervia contained 38, 33, 34, 40, 23, and 35 SSRs, respectively (Fig. 3A). Mononucleotide repeats were the most abundant type in all seven species, followed by dinucleotide repeats: M. simplicifolia had four, whereas M. racemosa, M. horridula, and M. punicea each had eight. In M. simplicifolia, all mononucleotide repeats (100%) and most dinucleotide repeats (75%) consisted of A/T nucleotides (Fig. 3B).
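To make the SSR counts above concrete, the following is a minimal sketch of how perfect SSRs with 1–6 nt units can be located in a sequence. It is not the pipeline used in this study: the function names and the minimum repeat counts (10, 5, 4, 3, 3, 3) are illustrative assumptions and should be replaced by the thresholds actually applied.

```python
import re

# Assumed minimum repeat counts per unit length (illustrative, not from the study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """Return True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for unit_len, min_count in min_repeats.items():
        # One captured unit followed by at least (min_count - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


def at_only_fraction(hits, unit_len):
    """Fraction of SSRs of a given unit length whose motif contains only A and T."""
    motifs = [motif for _, motif, _ in hits if len(motif) == unit_len]
    if not motifs:
        return None
    return sum(set(m) <= {"A", "T"} for m in motifs) / len(motifs)


# Toy sequence: one A(12) run and one (AT)6 run.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CG"
ssrs = find_ssrs(toy)
```

Applied to a cp genome sequence, `len(find_ssrs(genome))` would give a total comparable to the per-species counts, and `at_only_fraction` corresponds to the A/T proportions reported for mono- and dinucleotide repeats, provided the same thresholds are used.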
Repeat sequences are informative for phylogenetic research and are associated with genome rearrangement. In the cp genome of M. simplicifolia, 27 dispersed repeats were identified, comprising 10 forward repeats and 17 palindromic repeats (Fig. 4A). The six other Meconopsis cp genomes showed a similar pattern, with the number of repeats ranging from 29 in M. punicea to 50 in M. integrifolia. Palindromic repeats were the most prevalent repeat type among the seven Meconopsis species (Fig. 4B, C). Most of these repeats were 30–44 bp in length.
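The distinction between forward and palindromic repeats can be illustrated with a short sketch. A forward repeat is the same sequence occurring twice on one strand, whereas a palindromic repeat is a sequence paired with its reverse complement elsewhere on that strand. The code below only finds exact, non-overlapping seed matches of a fixed length `k`; it does not extend seeds to maximal repeats or allow mismatches, and it is not the tool used in this study. The function names and the default `k=30` are assumptions chosen to match the lower bound of the repeat lengths reported above.

```python
from collections import defaultdict
from itertools import combinations

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(s):
    """Reverse complement of an uppercase DNA string."""
    return s.translate(_COMPLEMENT)[::-1]


def seed_repeats(seq, k=30):
    """Return (forward, palindromic) lists of 0-based (i, j) seed pairs with j >= i + k."""
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)

    forward, palindromic = [], []
    for kmer, positions in index.items():
        # Forward: the identical k-mer at two non-overlapping positions.
        for a, b in combinations(positions, 2):
            if b >= a + k:
                forward.append((a, b))
        # Palindromic: the reverse complement of the k-mer further downstream.
        for i in positions:
            for j in index.get(revcomp(kmer), ()):
                if j >= i + k:
                    palindromic.append((i, j))
    return sorted(forward), sorted(palindromic)


# Toy example with k=6: AAACCG occurs twice, and its reverse complement CGGTTT once.
toy = "AAACCG" + "TTT" + "AAACCG" + "TT" + "CGGTTT"
fwd, pal = seed_repeats(toy, k=6)
```

A repeat longer than `k` produces several consecutive seed pairs (here three forward seeds for one 8 bp forward repeat), so seeds would need to be merged before counts could be compared with the totals in Fig. 4.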
Comparative analysis of cp genomes of Meconopsis species
Comparative analysis of cp genomes can provide insight into evolutionary relationships. In this study, we compared the cp genomes of M. simplicifolia and six other Meconopsis species. The seven Meconopsis cp genomes ranged in size from 151,864 bp (M. integrifolia) to 154,997 bp (M. quintuplinervia). Within the genus Meconopsis, the genome of M. simplicifolia showed a high degree of conservation and could be accurately mapped (Fig. 5). To assess sequence identity among the Meconopsis cp genomes, we used the mVISTA software. The IR regions were less divergent than the LSC and SSC regions (Fig. 6). Non-coding regions were more variable than coding regions, with marked divergence in intergenic spacers among the seven cp genomes. The highly divergent regions included trnH-psbA, matK, rps16-psbK, atpH-atpI, rpoC2, psbM-petN, trnE-trnT, trnT-psbD, psaA-ycf3, trnF-ndhJ, ndhK, ndhC-trnV, atpB-rbcL, accD, ycf4-cemA, petA-psbL, psbE-petL, clpP-psbB, rpl16, ndhF-rpl32-ccsA, and ycf1, among others.
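The kind of identity profile summarized in Fig. 6 can be approximated by computing percent identity in sliding windows along a pairwise alignment. The sketch below illustrates that idea only; it does not reproduce mVISTA's algorithm or interface. The function name, the window and step sizes, and the convention that a gap column counts as a mismatch are all assumptions.

```python
def windowed_identity(ref_aln, qry_aln, window=100, step=50):
    """Percent identity per window for two aligned, equal-length sequences.

    Gaps ('-') are treated as mismatches. Returns a list of
    (window_start, percent_identity) tuples with 0-based starts.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    if not ref_aln:
        return []
    profile = []
    last_start = max(len(ref_aln) - window, 0)
    for start in range(0, last_start + 1, step):
        ref_win = ref_aln[start:start + window]
        qry_win = qry_aln[start:start + window]
        matches = sum(1 for a, b in zip(ref_win, qry_win) if a == b and a != "-")
        profile.append((start, 100.0 * matches / len(ref_win)))
    return profile


# Toy alignment: the second window contains one gap and one substitution.
profile = windowed_identity("ACGTACGT", "ACGTAC-A", window=4, step=4)
```

Windows with low identity would correspond to the divergent regions listed above, such as intergenic spacers, while windows in the IR regions would be expected to stay close to 100%.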
A detailed comparison of the boundary regions between the inverted repeats and the single-copy regions (IR/LSC and IR/SSC) was performed among the seven species (Fig. 7). In all species, the rpl22 gene was located within the LSC region. Gene content at the junctions varied: the ycf1 gene was present at the SSC/IRa junction in M. simplicifolia, M. horridula, M. punicea, M. racemosa, M. henrici, and M. quintuplinervia, whereas it was absent from this junction in M. integrifolia. Expansion and contraction of the IR regions were also observed. For example, the rps19 gene lay entirely within the LSC region in M. racemosa, whereas in the other six Meconopsis species it spanned the LSC/IRb boundary, with offsets of 67–158 bp. Only in M. simplicifolia did the rpl2 gene extend into the LSC region. Overall, there were only minor variations in the IR boundary regions among the cp genomes of these seven Meconopsis species.
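Statements such as "within the LSC region" or "spanning the LSC/IRb boundary" reduce to a comparison of gene coordinates with the junction position. The following sketch shows one way to express that comparison. The function name, the coordinate convention (1-based, inclusive, with the junction defined as the last base of the upstream region), and the example coordinates are hypothetical and are not taken from the annotated genomes.

```python
def junction_status(gene_start, gene_end, junction):
    """Classify a gene relative to a region junction.

    Coordinates are 1-based and inclusive. `junction` is the last base of the
    upstream region (for example, the last base of the LSC before IRb).
    """
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    if gene_end <= junction:
        # Gene lies entirely upstream; report the distance to the junction.
        return {"status": "upstream", "gap_bp": junction - gene_end}
    if gene_start > junction:
        # Gene lies entirely downstream of the junction.
        return {"status": "downstream", "gap_bp": gene_start - junction - 1}
    # Gene crosses the junction; report how many bases fall on each side.
    return {
        "status": "spanning",
        "upstream_bp": junction - gene_start + 1,
        "downstream_bp": gene_end - junction,
    }


# Hypothetical coordinates with the junction after base 1000.
inside_lsc = junction_status(700, 950, 1000)
crossing = junction_status(900, 1178, 1000)
```

For a spanning gene, `upstream_bp` and `downstream_bp` sum to the gene length, which gives a quick consistency check on the annotated boundaries.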
Phylogenetic analysis of M. simplicifolia and related Meconopsis species cp genomes
To determine the phylogenetic position of M. simplicifolia within the family Papaveraceae, we used the cp genomes of ten Papaveraceae members, including M. simplicifolia. A species tree was constructed from the alignment of 75 shared protein-coding genes from the cp genomes. The analysis revealed that M. simplicifolia clustered with M. betonicifolia (Fig. 8). All Meconopsis species formed a well-supported clade, indicating the high potential of cp genomes for species differentiation within the family Papaveraceae.
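Building a tree from shared protein-coding genes typically involves selecting the genes present in every taxon and concatenating their alignments into a single matrix. The sketch below illustrates that selection and concatenation step only, under the assumption that each gene has already been aligned separately; it is not the authors' workflow, and the function name and input layout (gene name mapped to a dictionary of species and aligned sequences) are hypothetical.

```python
def concatenate_shared(alignments):
    """Concatenate per-gene alignments for genes present in every species.

    `alignments` maps gene name -> {species: aligned sequence}.
    Returns (shared_gene_names, {species: concatenated sequence}).
    """
    species = set().union(*(set(aln) for aln in alignments.values()))
    # Keep only genes that have a sequence for every species.
    shared = sorted(gene for gene, aln in alignments.items() if set(aln) == species)
    matrix = {sp: "" for sp in sorted(species)}
    for gene in shared:
        lengths = {len(seq) for seq in alignments[gene].values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to equal length")
        for sp in matrix:
            matrix[sp] += alignments[gene][sp]
    return shared, matrix


# Toy input: ycf1 is missing from sp2, so it is excluded from the matrix.
toy = {
    "rbcL": {"sp1": "ATG-", "sp2": "ATGC"},
    "matK": {"sp1": "CC", "sp2": "CT"},
    "ycf1": {"sp1": "GG"},
}
genes, supermatrix = concatenate_shared(toy)
```

The resulting matrix could then be passed to a tree-building program of choice. With ten taxa and 75 shared genes, `genes` would contain 75 entries and `supermatrix` ten sequences of equal length.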