Mapping and classification of chloroplast genes in Amaryllidaceae plants
GenBank files for species of the Amaryllidaceae were downloaded, processed, and analyzed, and a total of 63 circular chloroplast genome maps were drawn (no map was produced for Allium macranthum). The species were then classified by chloroplast gene number, rRNA number, tRNA number, and GC content. By chloroplast gene number, the species fall into 12 classes: 68, 69, 78, 81, 82, 83, 84, 85, 86, 87, 88, and 89 genes. The 86-gene class is the largest, with 38 species (approximately 59%). The classes with 68, 69, 78, 81, and 82 genes are the smallest, each containing a single species (approximately 2% each): Allium paradoxum with 68 genes, Allium przewalskianum with 69 genes, Allium ferganicum with 78 genes, Allium ampeloprasum with 81 genes, and Hippeastrum rutilum with 82 genes (Supplementary Fig. 1). According to the 63 chloroplast genome maps of Amaryllidaceae plants, the number of rRNAs in each species was 8 (Supplementary Fig. 2).
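The percentages quoted for the gene-number classes are simple proportions. The minimal Python sketch below reproduces them under the assumption that the denominator is the 64 species analyzed; the function name `share` is hypothetical and is not part of the original analysis.

```python
def share(n_species: int, total: int = 64) -> int:
    """Percentage of species in a class, rounded to the nearest integer.

    Assumption: the denominator is the 64 species in the data set.
    """
    return round(100 * n_species / total)


# Counts taken from the text: 38 species have 86 genes, and each of the
# classes with 68, 69, 78, 81, and 82 genes contains a single species.
print(share(38))  # share of the 86-gene class
print(share(1))   # share of a single-species class
```

With these inputs the two calls give 59 and 2, matching the approximate values reported in the text.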
By tRNA number, the species fall into six classes: 31, 36, 37, 38, 39, and 42 tRNAs. The classes with 31, 36, 37, and 39 tRNAs are the smallest, each containing a single species; four species have 42 tRNAs; and the 38-tRNA class is the largest, with 55 species. Thus, among the 63 fully sequenced chloroplast genomes of Amaryllidaceae plants, the vast majority have 38 tRNAs. Some species have atypical tRNA numbers due to the loss or duplication of tRNA genes: Allium chinense has only 31 tRNAs, Clivia gardenii has only 36, and Allium cyathophorum CMS-S has only 37, whereas Lycoris anhuiensis, Lycoris radiata, Hippeastrum vittatum, and Zephyranthes phycelloides each have 42 (Supplementary Fig. 3).
By GC content, the chloroplast genomes of Amaryllidaceae plants fall into three classes: 36% ≤ GC content < 37%, 37% ≤ GC content < 38%, and 38% ≤ GC content < 39% (Supplementary Table 1, Supplementary Fig. 4). The class with 36% ≤ GC content < 37% is the largest, with 39 species (Table 1); the chloroplast genome maps of these 39 species are shown in Fig. 1. The class with 37% ≤ GC content < 38% contains 23 species (Table 2); chloroplast genome maps for 22 species of this class are shown in Fig. 2. The class with 38% ≤ GC content < 39% is the smallest, with only 2 species (Supplementary Table 2); the chloroplast genome maps of these 2 species are shown in Fig. 3.
The GC content of the chloroplast genomes of these 64 Amaryllidaceae species ranges from approximately 36% to 39%, and the AT content from approximately 61% to 64%, indicating that codons in the chloroplast genome preferentially use A/T bases.
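As an illustration of how GC content and the three classes above can be computed, here is a minimal sketch in plain Python. It is not the pipeline used in this study; the function names `gc_content` and `gc_class` and the toy sequence are assumptions introduced for the example.

```python
def gc_content(seq: str) -> float:
    """GC percentage over unambiguous bases (A, C, G, T) only."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


def gc_class(gc_percent: float) -> str:
    """Assign a GC percentage to one of the three classes used in the text."""
    for low in (36, 37, 38):
        if low <= gc_percent < low + 1:
            return f"{low}% <= GC < {low + 1}%"
    return "outside 36-39%"


toy = "ATGCATATTAAT"                # toy sequence, not real data
gc = gc_content(toy)
print(round(gc, 2), round(100 - gc, 2))  # GC% and the complementary AT%
print(gc_class(36.8))
```

Because AT content is the complement of GC content, a genome with 36% to 39% GC necessarily has 61% to 64% AT, which is the relationship stated above.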
Evolutionary Tree Analysis of Chloroplast Genomes in 64 Amaryllidaceae plants
The phylogenetic analysis of the chloroplast genomes of plants in the Amaryllidaceae shows that they can be divided into 12 groups (labeled clockwise as Group1-Group12). Overall, the members of Group1 are the most primitive and evolve the slowest, while the members of Group12 evolve the fastest. Within Group12, Lycoris longituba and Lycoris anhuiensis have the fastest evolutionary speed, while Narcissus poeticus has the slowest evolutionary speed. Group1 has 2 members, Group2 has 14 members, Group3 has 10 members, Group4 has 3 members, Group5 has 4 members, Group6 has 2 members, Group7 has 5 members, Group8 has 5 members, Group9 has 1 member, Group10 has 2 members, Group11 has 7 members, and Group12 has 9 members (Fig. 4). Based on this evolutionary tree, Allium plants have a slower evolutionary speed, while Lycoris plants have the fastest evolutionary speed.
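A quick consistency check on the group sizes listed for the whole-genome tree can be sketched as follows. The helper name `total_members` is hypothetical; the counts are copied from the paragraph above.

```python
def total_members(group_sizes):
    """Sum the per-group member counts of one tree."""
    return sum(group_sizes)


# Group1..Group12 sizes for the whole-chloroplast-genome tree (Fig. 4).
genome_tree = [2, 14, 10, 3, 4, 2, 5, 5, 1, 2, 7, 9]

print(len(genome_tree), total_members(genome_tree))  # groups, members
```

The 12 groups add up to 64 members, which agrees with the 64 species named in the section heading.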
The evolutionary analysis of the plant gene atpA in the Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene atpA in the Amaryllidaceae can be divided into 13 groups (labeled clockwise as Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8, Group9, Group10, Group11, Group12, Group13). Overall, the members of Group1 are the most primitive and evolve the slowest, while Group13 evolves the fastest. In Group13, Hippeastrum alberti atpA, Hippeastrum hybrid cultivar atpA, Hippeastrum reticulatum atpA, Hippeastrum vittatum atpA, Lycoris sprengeri atpA and Lycoris sanguinea atpA have the fastest evolution rates, while Clivia miniata atpA and Zephyranthes candida atpA have the slowest evolution rates. Group1 has 2 members, Group2 has 4 members, Group3 has 8 members, Group4 has 14 members, Group5 has 9 members, Group6 has 1 member, Group7 has 2 members, Group8 has 1 member, Group9 has 2 members, Group10 has 1 member, Group11 has 1 member, Group12 has 6 members, and Group13 has 11 members (Fig. 5-A).
The evolutionary analysis of the chloroplast gene atpB in plants of the Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene atpB in the Amaryllidaceae can be divided into 13 groups (sequentially labeled as Group1-Group13). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group13 has the fastest evolutionary speed. In Group 1, Allium oschaninii atpB and Allium polyrhizum CMS-S atpB have the fastest evolution speed, while Allium chinense atpB has the slowest evolution speed; In Group13, Lycoris anhuiensis atpB, Lycoris squamigera atpB, Lycoris radiata atpB, Lycoris longituba atpB and Lycoris chinensis atpB have the fastest evolution speed, while Clivia miniata atpB has the slowest evolution speed. Group1 has 3 members, Group2 has 14 members, Group3 has 4 members, Group4 has 12 members, Group5 has 1 member, Group6 has 7 members, Group7 has 2 members, Group8 has 1 member, Group9 has 1 member, Group10 has 1 member, Group11 has 1 member, Group12 has 5 members, and Group13 has 10 members (Fig. 5-B).
The evolutionary analysis of the chloroplast gene atpF in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene atpF in Amaryllidaceae can be divided into 12 groups (labeled as Group1, Group2, Group3, Group4, Group5, Group6, Group7, Group8, Group9, Group10, Group11, Group12). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group12 has the fastest evolutionary speed. In Group 1, Allium chinense atpF and Allium herderianum CMS-S atpF have the fastest evolution speed, while Allium praemixtum atpF has the slowest evolution speed; In Group12, Lycoris aurea atpF, Lycoris anhuiensis atpF and Lycoris radiata atpF have the fastest evolution speed, while Lycoris sanguinea atpF has the slowest evolution speed. Group1 has 3 members, Group2 has 4 members, Group3 has 23 members, Group4 has 3 members, Group5 has 1 member, Group6 has 5 members, Group7 has 5 members, Group8 has 1 member, Group9 has 1 member, Group10 has 2 members, Group11 has 4 members, and Group12 has 10 members (Fig. 5-C).
The evolutionary analysis of the chloroplast gene atpI in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene atpI in Amaryllidaceae can be divided into 11 groups (labeled as Group1-Group11 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group11 has the fastest evolutionary speed. In Group1, Allium ferganicum atpI, Allium ampeloprasum atpI, and Allium sativum atpI have the fastest evolution speed, while Allium herderianum CMS-S atpI has the slowest evolution speed; in Group11, Lycoris radiata atpI, Hippeastrum vittatum atpI, Lycoris anhuiensis atpI, and Lycoris aurea atpI have the fastest evolution speed, while Hippeastrum alberti atpI, Hippeastrum hybrid cultivar atpI, Zephyranthes candida atpI, and Hippeastrum reticulatum atpI have the slowest evolution speed. Group1 has 4 members, Group2 has 22 members, Group3 has 7 members, Group4 has 2 members, Group5 has 1 member, Group6 has 10 members, Group7 has 1 member, Group8 has 5 members, Group9 has 1 member, Group10 has 1 member, and Group11 has 8 members (Fig. 5-D).
The atpA, atpB, atpF, and atpI genes are all photosynthetic genes encoding subunits of ATP synthase. ATP synthase is a key enzyme of energy metabolism in organisms; it is widely present in chloroplasts and mitochondria, where it drives photophosphorylation and oxidative phosphorylation, respectively.
The evolutionary analysis of the chloroplast gene ccsA in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene ccsA in Amaryllidaceae can be divided into 14 groups (labeled as Group1-Group14). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group14 has the fastest evolutionary speed. In Group1, Allium polyrhizum CMS-S ccsA and Allium przewalskianum ccsA have the fastest evolution speed, while Allium caeruleum ccsA has the slowest evolution speed; in Group14, Lycoris longituba ccsA, Lycoris squamigera ccsA, and Lycoris anhuiensis ccsA have the fastest evolution speed, while Narcissus poeticus ccsA has the slowest evolution speed. Group1 has 3 members, Group2 has 4 members, Group3 has 10 members, Group4 has 1 member, Group5 has 10 members, Group6 has 1 member, Group7 has 3 members, Group8 has 1 member, Group9 has 9 members, Group10 has 2 members, Group11 has 1 member, Group12 has 1 member, Group13 has 7 members, and Group14 has 9 members (Fig. 6-A).
The evolutionary analysis of the chloroplast gene cemA in plants of the Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene cemA in the Amaryllidaceae can be divided into 17 groups (sequentially labeled as Group1-Group17). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group17 has the fastest evolutionary speed. In Group17, Clivia miniata cemA and Hippeastrum rutilum cemA have the fastest evolution speed, while Lycoris chinensis cemA, Lycoris sanguinea cemA, and Lycoris sprengeri cemA have the slowest evolution speed. Group1 has 2 members, Group2 has 4 members, Group3 has 6 members, Group4 has 3 members, Group5 has 14 members, Group6 has 1 member, Group7 has 1 member, Group8 has 1 member, Group9 has 2 members, Group10 has 4 members, Group11 has 1 member, Group12 has 4 members, Group13 has 1 member, Group14 has 1 member, Group15 has 3 members, Group16 has 6 members, and Group17 has 8 members (Fig. 6-B).
The evolutionary analysis of the chloroplast gene matK in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene matK in Amaryllidaceae can be divided into 14 groups (sequentially labeled as Group1-Group14). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group14 has the fastest evolutionary speed. In Group1, Allium praemixtum matK and Allium oschaninii matK have the fastest evolution speed, while Allium chinense matK has the slowest evolution speed; in Group14, Lycoris chinensis matK and Lycoris radiata matK have the fastest evolution speed, while Lycoris aurea matK has the slowest evolution speed. Group1 has 4 members, Group2 has 6 members, Group3 has 14 members, Group4 has 5 members, Group5 has 3 members, Group6 has 4 members, Group7 has 1 member, Group8 has 6 members, Group9 has 1 member, Group10 has 1 member, Group11 has 1 member, Group12 has 1 member, Group13 has 7 members, and Group14 has 8 members (Fig. 6-C).
The evolutionary analysis of the chloroplast gene rbcL in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene rbcL in Amaryllidaceae can be divided into 13 groups (sequentially labeled as Group1-Group13). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group13 has the fastest evolutionary speed. In Group13, Lycoris sanguinea rbcL, Lycoris longituba rbcL, Lycoris aurea rbcL, and Lycoris anhuiensis rbcL have the fastest evolution speed, while Narcissus poeticus rbcL has the slowest evolution speed. Group1 has 1 member, Group2 has 7 members, Group3 has 14 members, Group4 has 9 members, Group5 has 7 members, Group6 has 1 member, Group7 has 1 member, Group8 has 4 members, Group9 has 1 member, Group10 has 1 member, Group11 has 1 member, Group12 has 6 members, and Group13 has 9 members (Fig. 6-D).
The ccsA gene encodes a protein involved in c-type cytochrome synthesis, and the protein encoded by ccsA is well conserved; cemA is an envelope protein gene that directs the synthesis of envelope proteins [73–74]. Chloroplast outer envelope proteins play important roles in signal transduction, protein import, lipid biosynthesis and remodeling, the exchange of ions and numerous metabolites, plastid division, motility, and host defense [75]. The matK gene is a maturase gene commonly used in various studies because of its discriminating power at the species level, and it is now widely used as a barcode for identifying angiosperm species [76]. rbcL encodes the large subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO), a key enzyme in the light-driven carbon assimilation pathway that is present in all photosynthetic organisms. RuBisCO is thought to have appeared roughly 3 billion years ago, before the onset of oxygenic photosynthesis, and it is one of the most abundant proteins on Earth. RuBisCO is composed of a large subunit and a small subunit, and the large subunit is encoded by a single gene (rbcL) in the highly polyploid chloroplast genome [77].
The evolutionary analysis of the chloroplast gene petA in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene petA in Amaryllidaceae can be divided into 11 groups (labeled as Group1-Group11). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group11 has the fastest evolutionary speed. In Group 11, Hippeastrum rutilum petA and Agapanthus coddii petA have the fastest evolutionary speed, while Lycoris aurea petA has the slowest evolutionary speed. Group1 has 1 member, Group2 has 5 members, Group3 has 11 members, Group4 has 14 members, Group5 has 1 member, Group6 has 1 member, Group7 has 5 members, Group8 has 4 members, Group9 has 2 members, Group10 has 6 members, and Group11 has 12 members (Fig. 7-A).
The evolutionary analysis of the chloroplast gene petG in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene petG in Amaryllidaceae can be divided into four groups (labeled as Group1, Group2, Group3, and Group4 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 1 member, Group3 has 19 members, and Group4 has 41 members (Fig. 7-B).
The evolutionary analysis of the chloroplast gene petL in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene petL in Amaryllidaceae can be divided into four groups (labeled as Group1-Group4 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. Group1 has 2 members, Group2 has 3 members, Group3 has 4 members, and Group4 has 53 members (Fig. 7-C).
The evolutionary analysis of the chloroplast gene petN in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene petN in Amaryllidaceae can be divided into three groups (labeled as Group1-Group3 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group3 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 1 member, and Group3 has 60 members (Fig. 7-D).
petA, petG, petL, and petN are photosynthetic genes encoding subunits of the cytochrome b6/f complex. Cytochrome f is encoded by the petA gene in the plastome and is translated on thylakoid membrane-bound ribosomes as a precursor form; the peptide encoded by petG is crucial for the assembly or stability of the cytochrome b6/f complex; expression of petL contributes to the formation and stability of the dimeric cytochrome b6/f complex; petN encodes a subunit of the cytochrome b6/f complex that is located on the surface facing the chloroplast stroma and participates in electron transfer [78]. Studies have shown that the absence of petG or petN leads to a bleached phenotype and to the loss of photosynthetic electron transfer and photoautotrophy; the proteins encoded by these two genes are essential for the stability of the membrane complex [79].
The evolutionary analysis of the chloroplast gene psaA in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psaA in Amaryllidaceae can be divided into 8 groups (labeled as Group1-Group8 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group8 has the fastest evolutionary speed. In Group1, Allium polyrhizum CMS-S psaA, Allium cyathophorum CMS-S psaA, Allium forrestii psaA, and Allium ampeloprasum psaA have the fastest evolution speed, while Allium strictum CMS-S psaA and Allium trifurcatum psaA have the slowest evolution speed. Group1 has 8 members, Group2 has 23 members, Group3 has 2 members, Group4 has 1 member, Group5 has 10 members, Group6 has 1 member, Group7 has 6 members, and Group8 has 11 members (Fig. 8-A).
The evolutionary analysis of the chloroplast gene psaB in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psaB in Amaryllidaceae can be divided into 11 groups (labeled as Group1-Group11 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group11 has the fastest evolutionary speed. In Group1, Allium monanthum CMS-S psaB and Allium forrestii psaB have the fastest evolution speed, while Allium pskemense psaB has the slowest evolution speed; in Group11, Hippeastrum alberti psaB, Hippeastrum hybrid cultivar psaB, Hippeastrum reticulatum psaB, Hippeastrum rutilum psaB, Hippeastrum vittatum psaB, Lycoris sanguinea psaB, Lycoris sprengeri psaB, and Zephyranthes candida psaB have the fastest evolution speed, while Lycoris anhuiensis psaB, Lycoris chinensis psaB, Lycoris longituba psaB, Lycoris radiata psaB, and Lycoris squamigera psaB have the slowest evolution speed. Group1 has 3 members, Group2 has 6 members, Group3 has 24 members, Group4 has 1 member, Group5 has 1 member, Group6 has 9 members, Group7 has 1 member, Group8 has 1 member, Group9 has 2 members, Group10 has 1 member, and Group11 has 13 members (Fig. 8-B).
The evolutionary analysis of the chloroplast gene psaC in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psaC in Amaryllidaceae can be divided into three groups (labeled as Group1, Group2, and Group3 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group3 has the fastest evolutionary speed. In Group3, Allium neriniflorum psaC has the slowest evolution speed, with the remaining 59 members evolving the fastest. Group1 has 1 member, Group2 has 1 member, and Group3 has 60 members (Fig. 8-C).
The evolutionary analysis of the chloroplast gene psaI in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psaI in Amaryllidaceae can be divided into six groups (labeled as Group1-Group6 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group6 has the fastest evolutionary speed. In Group1, Allium obliquum psaI and Allium schoenoprasoides psaI have the fastest evolution speed, while Allium strictum CMS-S psaI has the slowest evolution speed. Group1 has 3 members, Group2 has 27 members, Group3 has 2 members, Group4 has 2 members, Group5 has 6 members, and Group6 has 22 members (Fig. 8-D).
psaA, psaB, psaC, and psaI are all photosynthetic genes encoding photosystem I subunits. Photosystem I is a macromolecular complex located in the thylakoid membrane of chloroplasts and cyanobacteria that catalyzes the light-driven reduction of ferredoxin and the oxidation of plastocyanin [80–82].
The evolutionary analysis of the chloroplast gene psbA in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psbA in Amaryllidaceae can be divided into five groups (labeled as Group1-Group5 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group5 has the fastest evolutionary speed. Group1 has 2 members, Group2 has 2 members, Group3 has 4 members, Group4 has 5 members, and Group5 has 49 members (Fig. 9-A).
The evolutionary analysis of the chloroplast gene psbB in Amaryllidaceae plants shows that the evolutionary relationship of the chloroplast gene psbB in Amaryllidaceae can be divided into four groups (labeled as Group1-Group4 in sequence). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. In Group 1, Allium kingdonii psbB, Allium macranthum psbB, Allium monanthum CMS-S psbB have the fastest evolution speed, while Allium forrestii psbB has the slowest evolution speed. Group1 has 6 members, Group2 has 9 members, Group3 has 19 members, and Group4 has 28 members (Fig. 9-B).
The evolutionary analysis of the chloroplast gene psbC in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbC in Amaryllidaceae can be divided into 8 groups (labeled as Group1-Group8). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group8 has the fastest evolutionary speed. In Group8, Allium plurifoliatum psbC, Allium paepalanthoides psbC, Allium oschaninii psbC, Allium obliquum psbC, Allium maowenense CMS-S psbC, Allium mairei psbC, Allium herderianum CMS-S psbC, Allium galanthum psbC, Allium forrestii psbC, Allium fistulosum psbC, Allium ferganicum psbC, Allium cyaneum psbC, Allium chrysocephalum CMS-S psbC, Allium chrysanthum CMS-S psbC, Allium chinense psbC, Allium changduense psbC, Allium cepa psbC, Allium caeruleum psbC, Allium ampeloprasum psbC, Allium altaicum psbC, Allium polyrhizum CMS-S psbC, Allium praemixtum psbC, Allium pskemense psbC, Allium rude CMS-S psbC, Allium sativum psbC, Allium sikkimense psbC, Allium spicatum psbC, Allium strictum CMS-S psbC, Allium trifurcatum psbC, Allium tuberosum psbC, and Allium xichuanense CMS-S psbC have the fastest evolution speed, while Allium przewalskianum psbC has the slowest evolution speed. Group1 has 1 member, Group2 has 4 members, Group3 has 3 members, Group4 has 2 members, Group5 has 1 member, Group6 has 7 members, Group7 has 10 members, and Group8 has 34 members (Fig. 9-C).
The evolutionary analysis of the chloroplast gene psbD in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbD in Amaryllidaceae can be divided into five groups (labeled as Group1-Group5). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group5 has the fastest evolutionary speed. Group1 has 2 members, Group2 has 30 members, Group3 has 12 members, Group4 has 4 members, and Group5 has 14 members (Fig. 9-D).
The evolutionary analysis of the chloroplast gene psbF in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbF in Amaryllidaceae can be divided into four groups (labeled as Group1-Group4). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. Group1 has 2 members, Group2 has 1 member, Group3 has 17 members, and Group4 has 42 members (Fig. 9-E).
The evolutionary analysis of the chloroplast gene psbI in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbI in Amaryllidaceae can be divided into four groups (labeled as Group1-Group4). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 4 members, Group3 has 2 members, and Group4 has 55 members (Fig. 9-F).
The evolutionary analysis of the chloroplast gene psbJ in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbJ in Amaryllidaceae can be divided into six groups (labeled as Group1-Group6). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group6 has the fastest evolutionary speed. Group1 has 3 members, Group2 has 4 members, Group3 has 6 members, Group4 has 12 members, Group5 has 4 members, and Group6 has 33 members (Fig. 9-G).
The evolutionary analysis of the chloroplast gene psbK in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbK in Amaryllidaceae can be divided into five groups (labeled as Group1-Group5). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group5 has the fastest evolutionary speed. In Group1, Allium maowenense CMS-S psbK and Allium herderianum CMS-S psbK have the fastest evolution speed, while Allium ferganicum psbK has the slowest evolution speed. Group1 has 4 members, Group2 has 11 members, Group3 has 17 members, Group4 has 2 members, and Group5 has 28 members (Fig. 9-H).
The evolutionary analysis of the chloroplast gene psbL in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene psbL in Amaryllidaceae can be divided into three groups (labeled as Group1-Group3). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group3 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 1 member, and Group3 has 60 members (Fig. 9-I).
psbA, psbB, psbC, psbD, psbF, psbI, psbJ, psbK, and psbL are photosynthetic genes encoding photosystem II (PSII) subunits. PSII is a large pigment-protein complex embedded in the thylakoid membrane. Its main function is the light-induced oxidation of water and the transfer of the extracted electrons to plastoquinone (PQ) molecules in the lipid phase of the membrane. The core of PSII is a heterodimer of the D1 and D2 proteins, which bind the redox-active components of PSII. psbA encodes the D1 protein; the protein encoded by psbB may play a regulatory role in the assembly of PSII; psbC encodes the chlorophyll (Chl)-binding protein CP43; psbD encodes the D2 protein. Cytochrome (cyt) b559, a two-subunit redox-active protein component, is also indispensable for PSII assembly and photoprotection; its α and β subunits are encoded by psbE and psbF, respectively. psbI encodes a low-molecular-weight subunit of PSII that supports the efficient formation and stabilization of the PSII dimer in vivo: PsbI is required during assembly of the PSII dimer, but once the dimer has formed it is no longer required to maintain its stability. psbJ encodes a single transmembrane α-helix located close to the quinone QB binding site. The amino acid sequence of the protein encoded by psbL indicates that it has a single transmembrane α-helical domain, and this protein is very important for the operation of PSII [83–90].
The evolutionary analysis of the chloroplast gene rpl2 in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene rpl2 in Amaryllidaceae can be divided into 8 groups (labeled as Group1-Group8). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group8 has the fastest evolutionary speed. In Group1, Allium ferganicum rpl2_1, Allium ferganicum rpl2_2, Allium ampeloprasum rpl2_1, Allium ampeloprasum rpl2_2, Allium sativum rpl2_1, and Allium sativum rpl2_2 have the fastest evolution speed, while Allium maowenense CMS-S rpl2_1 and Allium maowenense CMS-S rpl2_2 have the slowest evolution speed; in Group8, Zephyranthes candida rpl2_1, Zephyranthes candida rpl2_2, Lycoris sprengeri rpl2_1, Lycoris sprengeri rpl2_2, Lycoris sanguinea rpl2_1, Lycoris sanguinea rpl2_2, Lycoris longituba rpl2_1, Lycoris longituba rpl2_2, Lycoris chinensis rpl2_1, and Lycoris chinensis rpl2_2 have the fastest evolution speed, while Narcissus poeticus rpl2_1, Narcissus poeticus rpl2_2, Hippeastrum rutilum rpl2_1, and Hippeastrum rutilum rpl2_2 have the slowest evolution speed. Group1 has 8 members, Group2 has 44 members, Group3 has 2 members, Group4 has 6 members, Group5 has 28 members, Group6 has 2 members, Group7 has 16 members, and Group8 has 18 members (Fig. 10-A).
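The rpl2 tree lists two sequences per species (rpl2_1 and rpl2_2), so its member counts are on a different scale from those of the single-copy gene trees. The short sketch below makes the arithmetic explicit, assuming that every genome in the tree contributes exactly two rpl2 copies; the variable names are hypothetical, and the psbL counts are those reported for Fig. 9-I.

```python
# Group sizes copied from the text.
rpl2_groups = [8, 44, 2, 6, 28, 2, 16, 18]   # rpl2 tree (Fig. 10-A)
psbL_groups = [1, 1, 60]                     # a single-copy gene tree (Fig. 9-I)

COPIES_PER_GENOME = 2  # assumption: rpl2_1 and rpl2_2 in every genome

rpl2_sequences = sum(rpl2_groups)
genomes, leftover = divmod(rpl2_sequences, COPIES_PER_GENOME)

print(rpl2_sequences, genomes, leftover, sum(psbL_groups))
```

Under that assumption, the 124 rpl2 sequences correspond to 62 genomes with no remainder, the same number of members as in the psbL tree.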
The evolutionary analysis of the chloroplast gene rpl14 in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene rpl14 in Amaryllidaceae can be divided into five groups (labeled as Group1-Group5). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group5 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 1 member, Group3 has 3 members, Group4 has 13 members, and Group5 has 44 members (Fig. 10-B).
The evolutionary analysis of the chloroplast gene rpl20 in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene rpl20 in Amaryllidaceae can be divided into 12 groups (labeled as Group1-Group12). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group12 has the fastest evolutionary speed. In Group12, Hippeastrum alberti rpl20, Hippeastrum hybrid cultivar rpl20, Hippeastrum reticulatum rpl20, Hippeastrum vittatum rpl20, and Zephyranthes phycelloides rpl20 have the fastest evolution speed, while Hippeastrum rutilum rpl20 has the slowest evolution speed. Group1 has 2 members, Group2 has 6 members, Group3 has 13 members, Group4 has 13 members, Group5 has 5 members, Group6 has 5 members, Group7 has 1 member, Group8 has 1 member, Group9 has 3 members, Group10 has 2 members, Group11 has 3 members, and Group12 has 8 members (Fig. 10-C).
The evolutionary analysis of the chloroplast gene rpl32 in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene rpl32 in Amaryllidaceae can be divided into six groups (labeled as Group1-Group6). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group6 has the fastest evolutionary speed. Group1 has 1 member, Group2 has 14 members, Group3 has 20 members, Group4 has 8 members, Group5 has 7 members, and Group6 has 12 members (Fig. 10-D).
The evolutionary analysis of the chloroplast gene rpl33 in Amaryllidaceae shows that the evolutionary relationship of the chloroplast gene rpl33 in Amaryllidaceae can be divided into four groups (labeled as Group1-Group4). Overall, Group1 is the most primitive and has the slowest evolutionary speed, while Group4 has the fastest evolutionary speed. In Group1, Allium rude CMS-S rpl33, Allium herderianum CMS-S rpl33, and Allium chrysanthum CMS-S rpl33 have the fastest evolution speed, while Allium monanthum CMS-S rpl33 has the slowest evolution speed; in Group4, Hippeastrum alberti rpl33, Hippeastrum hybrid cultivar rpl33, Hippeastrum reticulatum rpl33, Hippeastrum vittatum rpl33, Lycoris anhuiensis rpl33, Lycoris aurea rpl33, Lycoris chinensis rpl33, Lycoris longituba rpl33, Lycoris radiata rpl33, Lycoris sanguinea rpl33, Lycoris sprengeri rpl33, Lycoris squamigera rpl33, and Zephyranthes phycelloides rpl33 have the fastest evolution speed, while Allium macranthum rpl33 has the slowest evolution speed. Group1 has 9 members, Group2 has 17 members, Group3 has 12 members, and Group4 has 24 members (Fig. 10-E).
rpl2, rpl14, rpl20, rpl32, and rpl33 are protein-coding genes for proteins of the large ribosomal subunit. The large and small ribosomal subunits cooperate to translate mRNA into polypeptide chains during protein synthesis.
Evolutionary analysis of the chloroplast gene rpoA in Amaryllidaceae showed that its sequences fall into 8 groups (group1 to group8). Overall, group1 is the most primitive and slowest-evolving, and group8 is the fastest-evolving. In group8, Lycoris anhuiensis rpoA and Lycoris longituba rpoA evolve fastest, and Hippeastrum rutilum rpoA evolves slowest. Group1 has 2 members, group2 has 21 members, group3 has 3 members, group4 has 6 members, group5 has 7 members, group6 has 5 members, group7 has 2 members, and group8 has 16 members (Fig. 11-A).
Evolutionary analysis of the chloroplast gene rpoB in Amaryllidaceae showed that its sequences fall into 11 groups (group1 to group11). Overall, group1 is the most primitive and slowest-evolving, and group11 is the fastest-evolving. In group1, Allium cyathophorum CMS-S rpoB and Allium spicatum rpoB evolve fastest, and Allium polyrhizum CMS-S rpoB evolves slowest; in group11, Lycoris aurea rpoB and Lycoris radiata rpoB evolve fastest, and Hippeastrum rutilum rpoB evolves slowest. Group1 has 7 members, group2 has 8 members, group3 has 5 members, group4 has 12 members, group5 has 11 members, group6 has 1 member, group7 has 1 member, group8 has 1 member, group9 has 1 member, group10 has 6 members, and group11 has 9 members (Fig. 11-B).
Evolutionary analysis of the chloroplast gene rpoC1 in Amaryllidaceae showed that its sequences fall into 12 groups (group1 to group12). Overall, group1 is the most primitive and slowest-evolving, and group12 is the fastest-evolving. In group12, Lycoris anhuiensis rpoC1, Lycoris radiata rpoC1, and Lycoris squamigera rpoC1 evolve fastest, and Hippeastrum rutilum rpoC1 evolves slowest. Group1 has 3 members, group2 has 4 members, group3 has 5 members, group4 has 6 members, group5 has 2 members, group6 has 12 members, group7 has 2 members, group8 has 5 members, group9 has 5 members, group10 has 1 member, group11 has 6 members, and group12 has 11 members (Fig. 11-C).
Evolutionary analysis of the chloroplast gene rpoC2 in Amaryllidaceae showed that its sequences fall into 10 groups (group1 to group10). Overall, group1 is the most primitive and slowest-evolving, and group10 is the fastest-evolving. In group10, Lycoris anhuiensis rpoC2 and Lycoris longituba rpoC2 evolve fastest, and Narcissus poeticus rpoC2 evolves slowest. Group1 has 1 member, group2 has 14 members, group3 has 4 members, group4 has 10 members, group5 has 3 members, group6 has 12 members, group7 has 1 member, group8 has 1 member, group9 has 7 members, and group10 has 9 members (Fig. 11-D).
rpoA, rpoB, rpoC1, and rpoC2 encode subunits of RNA polymerase, the enzyme that catalyzes the polymerization of ribonucleotides into RNA.
Evolutionary analysis of the chloroplast gene rps3 in Amaryllidaceae showed that its sequences fall into 14 groups (group1 to group14). Overall, group1 is the most primitive and slowest-evolving, and group14 is the fastest-evolving. In group14, Hippeastrum alberti rps3, Hippeastrum hybrid cultivar rps3, and Hippeastrum vittatum rps3 evolve fastest, and Lycoris aurea rps3 evolves slowest. Group1 has 1 member, group2 has 2 members, group3 has 9 members, group4 has 9 members, group5 has 12 members, group6 has 4 members, group7 has 1 member, group8 has 5 members, group9 has 1 member, group10 has 1 member, group11 has 2 members, group12 has 2 members, group13 has 5 members, and group14 has 8 members (Fig. 12-A).
Evolutionary analysis of the chloroplast gene rps4 in Amaryllidaceae showed that its sequences fall into seven groups (group1 to group7). Overall, group1 is the most primitive and slowest-evolving, and group7 is the fastest-evolving. In group7, Allium altaicum rps4, Allium caeruleum rps4, Allium cepa rps4, Allium chrysanthum CMS-S rps4, Allium chrysocephalum CMS-S rps4, Allium ferganicum rps4, Allium fistulosum rps4, Allium galanthum rps4, Allium rude CMS-S rps4, Allium schoenoprasoides rps4, and Allium xichuanense CMS-S rps4 evolve fastest, and Allium fasciculatum rps4 evolves slowest. Group1 has 2 members, group2 has 2 members, group3 has 15 members, group4 has 3 members, group5 has 4 members, group6 has 18 members, and group7 has 18 members (Fig. 12-B).
Evolutionary analysis of the chloroplast gene rps8 in Amaryllidaceae showed that its sequences fall into 13 groups (group1 to group13). Overall, group1 is the most primitive and slowest-evolving, and group13 is the fastest-evolving. In group13, Clivia miniata rps8, Lycoris anhuiensis rps8, Lycoris aurea rps8, Lycoris longituba rps8, Lycoris sanguinea rps8, Lycoris sprengeri rps8, Lycoris squamigera rps8, and Zephyranthes candida rps8 evolve fastest, and Hippeastrum reticulatum rps8 evolves slowest. Group1 has 1 member, group2 has 18 members, group3 has 11 members, group4 has 2 members, group5 has 1 member, group6 has 1 member, group7 has 5 members, group8 has 2 members, group9 has 3 members, group10 has 1 member, group11 has 3 members, group12 has 3 members, and group13 has 11 members (Fig. 12-C).
Evolutionary analysis of the chloroplast gene rps11 in Amaryllidaceae showed that its sequences fall into 10 groups (group1 to group10). Overall, group1 is the most primitive and slowest-evolving, and group10 is the fastest-evolving. In group1, Allium ampeloprasum rps11 and Allium sativum rps11 evolve fastest, and Allium pskemense rps11 evolves slowest; in group10, Lycoris anhuiensis rps11, Lycoris longituba rps11, Lycoris squamigera rps11, Hippeastrum rutilum rps11, Lycoris chinensis rps11, and Lycoris radiata rps11 evolve fastest, and Clivia miniata rps11, Lycoris aurea rps11, Lycoris sanguinea rps11, and Lycoris sprengeri rps11 evolve slowest. Group1 has 3 members, group2 has 17 members, group3 has 8 members, group4 has 1 member, group5 has 10 members, group6 has 3 members, group7 has 2 members, group8 has 1 member, group9 has 7 members, and group10 has 10 members (Fig. 12-D).
Evolutionary analysis of the chloroplast gene rps12 in Amaryllidaceae showed that its sequences fall into four groups (group1 to group4). Overall, group1 is the most primitive and slowest-evolving, and group4 is the fastest-evolving. Group1 has 2 members, group2 has 36 members, group3 has 23 members, and group4 has 62 members (Fig. 12-E).
Evolutionary analysis of the chloroplast gene rps15 in Amaryllidaceae showed that its sequences fall into 9 groups (group1 to group9). Overall, group1 is the most primitive and slowest-evolving, and group9 is the fastest-evolving. In group9, Lycoris anhuiensis rps15, Lycoris aurea rps15, Lycoris longituba rps15, Lycoris squamigera rps15, and Narcissus poeticus rps15 evolve fastest, and Zephyranthes candida rps15 evolves slowest. Group1 has 1 member, group2 has 11 members, group3 has 22 members, group4 has 9 members, group5 has 1 member, group6 has 2 members, group7 has 1 member, group8 has 2 members, and group9 has 14 members (Fig. 12-F).
Evolutionary analysis of the chloroplast gene rps18 in Amaryllidaceae showed that its sequences fall into six groups (group1 to group6). Overall, group1 is the most primitive and slowest-evolving, and group6 is the fastest-evolving. Group1 has 2 members, group2 has 5 members, group3 has 7 members, group4 has 2 members, group5 has 20 members, and group6 has 26 members (Fig. 12-G).
rps3, rps4, rps8, rps11, rps12, rps15, and rps18 encode proteins of the small ribosomal subunit, which is responsible for recognizing the mRNA information during translation.
Evolutionary analysis of the chloroplast gene ycf1 in Amaryllidaceae plants showed that its sequences fall into 14 groups (Group1 to Group14). Overall, Group1 is the most primitive and slowest-evolving, while Group14 is the fastest-evolving. In Group1, Allium fasciculatum ycf1_2 and Allium monanthum CMS-S ycf1_2 evolve fastest, and Allium fetisowii ycf1_1 evolves slowest; in Group14, Allium chrysanthum CMS-S ycf1_2 and Allium rude CMS-S ycf1_2 evolve fastest, and Allium obliquum ycf1_1 and Allium strictum CMS-S ycf1 evolve slowest. Group1 has 5 members, Group2 has 8 members, Group3 has 1 member, Group4 has 1 member, Group5 has 4 members, Group6 has 1 member, Group7 has 1 member, Group8 has 26 members, Group9 has 8 members, Group10 has 3 members, Group11 has 3 members, Group12 has 13 members, Group13 has 15 members, and Group14 has 18 members (Fig. 13-A).
Evolutionary analysis of the chloroplast gene ycf3 in Amaryllidaceae plants showed that its sequences fall into 10 groups (Group1 to Group10). Overall, Group1 is the most primitive and slowest-evolving, while Group10 is the fastest-evolving. In Group1, Allium nanodes ycf3, Allium prattii ycf3, and Allium victorialis ycf3 evolve fastest, while Allium polyrhizum CMS-S ycf3 evolves slowest. Group1 has 4 members, Group2 has 5 members, Group3 has 2 members, Group4 has 1 member, Group5 has 1 member, Group6 has 3 members, Group7 has 5 members, Group8 has 8 members, Group9 has 3 members, and Group10 has 30 members (Fig. 13-B).
Evolutionary analysis of the chloroplast gene ycf4 in Amaryllidaceae plants showed that its sequences fall into 13 groups (Group1 to Group13). Overall, Group1 is the most primitive and slowest-evolving, while Group13 is the fastest-evolving. Group1 has 1 member, Group2 has 6 members, Group3 has 1 member, Group4 has 6 members, Group5 has 2 members, Group6 has 2 members, Group7 has 10 members, Group8 has 6 members, Group9 has 3 members, Group10 has 2 members, Group11 has 2 members, Group12 has 6 members, and Group13 has 15 members (Fig. 13-C). ycf1, ycf3, and ycf4 are all open reading frames whose functions are not yet known.
Codon preference analysis of chloroplast genomes in Amaryllidaceae plants
The chloroplast genomes of 64 Amaryllidaceae plants downloaded from NCBI were analyzed with CodonW for codon composition and codon usage preference (Table 3). The number of synonymous codons (L_sym) ranges from 44299 to 49152, with an average of 47090; Hippeastrum rutilum has the most (49152) and Allium paradoxum the fewest (44299). The length of the encoded amino acid sequences (L_aa) ranges from 45776 to 50950, with an average of 48671; the longest belongs to Hippeastrum rutilum (50950) and the shortest to Allium paradoxum (45776). The average hydropathicity of the encoded amino acids (Gravy) reflects how strongly codons for hydrophilic or hydrophobic amino acids are favored. Gravy values range from -0.145 to -0.024, with an average of -0.081. All 64 values are negative, indicating that the proteins encoded by the chloroplast genomes of Amaryllidaceae plants are hydrophilic overall. The G + C content at the third position of synonymous codons (GC3s) ranges from 0.34 to 0.37, with an average of 0.36. GC3s is below 0.5, indicating that the chloroplast genomes of Amaryllidaceae plants prefer synonymous codons ending in A/U. The total G + C content of the chloroplast genomes ranges from 0.38 to 0.39, with an average of 0.38.
The GC content is below 0.5, that is, lower than the AT content, indicating weak codon preference. The frequencies of thymine (T3s), cytosine (C3s), adenine (A3s), and guanine (G3s) at the third position of synonymous codons range from 0.39 to 0.42, 0.23 to 0.25, 0.24 to 0.42, and 0.22 to 0.24, respectively. The codon bias index (CBI) reflects the proportion of optimal codons characteristic of highly expressed genes, and correlates well with Nec. The CBI of the 64 chloroplast genomes ranges from -0.102 to -0.079, with an average of -0.092; Agapanthus coddii has the highest CBI (-0.079), while Allium oschaninii and Allium pskemense share the lowest (-0.102). The effective number of codons (Nec) reflects how far codon usage deviates from random choice and is an important indicator of unequal synonymous codon usage. Typically, highly expressed genes show stronger codon preference and lower Nec values. Nec in the 64 chloroplast genomes ranges from 54.64 to 56.16, with an average of 55.41. These relatively high values indicate weak codon usage bias in the chloroplast genomes of Amaryllidaceae plants. Lycoris anhuiensis has the highest Nec (56.16) and Allium altaicum the lowest (54.64). The codon adaptation index (CAI) measures the codon preference of a gene and is widely used to predict gene expression levels. The CAI values of the 64 chloroplast genomes range from 0.154 to 0.162, with an average of 0.158.
The low CAI values indicate weak codon preference in the chloroplast genomes of Amaryllidaceae plants; Lycoris sprengeri has the highest CAI (0.162) and Hippeastrum rutilum the lowest (0.154). The frequency of optimal codons (Fop) is the ratio of optimal codons to all synonymous codons. Fop ranges from 0 to 1, where 1 means that only optimal codons are used and 0 means that none are used. The Fop values of the 64 chloroplast genomes range from 0.349 to 0.366, with an average of 0.357. These relatively low values indicate that the chloroplast genomes of Amaryllidaceae plants use optimal codons, but not exclusively, so codon preference is weak.
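As a minimal illustration of how a third-position composition statistic such as GC3s can be obtained from a coding sequence, the Python sketch below counts third-position bases over synonymous codons only. It is a simplification and not CodonW's implementation: the function names and the toy sequence are hypothetical, and CodonW's exact bookkeeping for the individual A3s, T3s, G3s, and C3s values may differ in detail.

```python
# Codons without a synonymous alternative (Met, Trp) and the three stop
# codons carry no information about synonymous choice, so they are skipped.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def third_position_counts(cds):
    """Count A, C, G, T at the third position of synonymous codons."""
    cds = cds.upper().replace("U", "T")
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in NON_SYNONYMOUS:
            continue
        if any(base not in counts for base in codon):
            continue  # skip codons containing ambiguous bases such as N
        counts[codon[2]] += 1
    return counts


def gc3s(cds):
    """Fraction of G + C at the third position of synonymous codons."""
    counts = third_position_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return float("nan")
    return (counts["G"] + counts["C"]) / total


# Hypothetical toy sequence: ATG GCT GCC AAA GGG TAA
toy_cds = "ATGGCTGCCAAAGGGTAA"
toy_gc3s = gc3s(toy_cds)
```

For the toy sequence, ATG and TAA are skipped and the remaining four codons end in T, C, A, and G, so half of the synonymous third positions are G or C.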
N-plot mapping analysis
The results of the effective codon number mapping analysis (Nec-plot mapping analysis) for chloroplast genome codons in plants of the Amaryllidaceae family are shown in Supplementary Fig. 5. Analysis shows that the chloroplast genomes of 64 species of Amaryllidaceae plants are distributed above the standard curve, indicating that codon usage is only affected by GC3.
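The standard curve in an Nec-plot is commonly taken to be Wright's expected relationship between the effective number of codons and GC3s when codon usage is shaped by base composition alone, Nec = 2 + s + 29 / (s^2 + (1 - s)^2) with s = GC3s. The Python sketch below assumes that this is the curve used here; the function names are hypothetical and serve only to show how a point's position relative to the curve can be checked.

```python
def expected_nec(gc3s):
    """Expected effective number of codons for a given GC3s (Wright's curve)."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must lie between 0 and 1")
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


def offset_from_curve(observed_nec, gc3s):
    """Positive if the observed point lies above the curve, negative if below."""
    return observed_nec - expected_nec(gc3s)
```

A point is classified by the sign of `offset_from_curve`; how large an offset counts as a meaningful departure from the curve is a separate judgment that this sketch does not make.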
Synonymous codon analysis
Among the 59 synonymous codons (excluding ATG encoding Met, TGG encoding Trp, and the 3 termination codons) in the chloroplast genomes of 64 species of Amaryllidaceae plants, 29 are high-frequency codons with URSC > 1: 13 end in A, 13 end in U, 2 end in G, and 1 ends in C. High-frequency codons in the chloroplasts of Amaryllidaceae plants therefore favor endings in A or U over endings in G or C. There are 28 low-frequency codons with URSC < 1: 1 ends in A, 2 end in U, 11 end in G, and 14 end in C. Codons ending in G or C are thus used less often in the chloroplast genomes of Amaryllidaceae plants than codons ending in A or U. The codons with URSC = 1 are UGG encoding Trp and AUG encoding Met. The URSC values are similar across the 64 chloroplast genomes; the largest belongs to AGA encoding Arg, followed by UCU encoding Ser (Fig. 14).
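The relative synonymous codon usage value (the quantity abbreviated URSC in the text, more commonly written RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below is a minimal illustration under stated assumptions: it uses the standard codon assignments (which plastid translation table 11 shares), and the function name and toy sequence are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard codon assignments in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def relative_synonymous_usage(cds):
    """Return {codon: value} for every sense codon of each amino acid seen."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families, one per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence
        for c in codons:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


# Hypothetical toy sequence: GCT GCT GCT GCA ATG TGG
toy_values = relative_synonymous_usage("GCTGCTGCTGCAATGTGG")
```

In the toy sequence alanine is encoded four times, three of them by GCT, so GCT scores 3.0, GCA scores 1.0, and the unused GCC and GCG score 0.0. Single-codon amino acids (Met, Trp) always score exactly 1, which matches the two codons reported with a value of 1 in the text.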
PR2-plot mapping analysis
PR2-plot (Parity Rule 2) analysis is used to investigate whether there is a mutation imbalance at the third codon position between A and T and between C and G. A scatter plot was drawn with G3/(G3 + C3) as the horizontal coordinate and A3/(A3 + T3) as the vertical coordinate (Supplementary Fig. 6). The points are not evenly distributed among the four regions; most lie in the region where G3/(G3 + C3) < 0.5 and A3/(A3 + T3) < 0.5. Overall, at the third codon position T is used more often than A, and C more often than G. If codon usage preference were caused entirely by base mutation, A and T, and likewise G and C, would be used at equal frequencies. The PR2-plot results therefore indicate that the codon usage preference of the chloroplast genomes of the 64 Amaryllidaceae species was shaped by base mutation together with natural selection and other factors.
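The two PR2 coordinates can be computed directly from third-position base counts. The Python sketch below is a minimal illustration with hypothetical function names and a toy sequence; it counts the third base of every complete codon, whereas some PR2 analyses restrict the counts to fourfold-degenerate codon families, and which convention was used here is not stated.

```python
def pr2_point(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) for the third codon positions of cds."""
    cds = cds.upper().replace("U", "T")
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(2, usable, 3):     # indices 2, 5, 8, ... are third positions
        base = cds[i]
        if base in counts:
            counts[base] += 1

    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]
    x = counts["G"] / gc if gc else float("nan")
    y = counts["A"] / at if at else float("nan")
    return x, y


def pr2_region(x, y):
    """Describe which side of the parity point (0.5, 0.5) a gene falls on."""
    gc_side = "C > G" if x < 0.5 else "G > C" if x > 0.5 else "G = C"
    at_side = "T > A" if y < 0.5 else "A > T" if y > 0.5 else "A = T"
    return gc_side + ", " + at_side


# Hypothetical toy sequence: GCT GCT GCA GCC (third bases T, T, A, C)
toy_x, toy_y = pr2_point("GCTGCTGCAGCC")
```

A point at (0.5, 0.5) corresponds to the equal-usage expectation stated above; the toy sequence falls in the region where both coordinates are below 0.5, the same region in which most points are reported to lie.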