Detection of atp6, B-atp6-orfH79 and orfH79 in Oryza rufipogon
In this study, the atp6 primers amplified a product from every individual in all populations (Table 2), indicating that the atp6 gene is widely present in O. rufipogon populations. This is consistent with the previous finding that mtDNA is conserved [19]. To verify whether orfH79 and its variants (GSV) were always accompanied by B-atp6, primers specific for orfH79 and B-atp6-orfH79 were employed. Both primer pairs amplified products from each individual of the HK, GZ, PS, TL, and YJ populations (Table 2), indicating that the GSV were always accompanied by B-atp6. Conversely, neither primer pair amplified a product from the remaining twelve populations, i.e., NHNC, WN, QH, NN, LB, XZ, HZ, GP, YL, FC, BH and DX. The detection of the orfH79 gene in five of the seventeen O. rufipogon populations in China, a medium frequency of about 30%, provides rich potential CMS sources for rice breeding.
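The "about 30%" figure can be checked with a short tally. The sketch below uses only the population codes listed in the paragraph above; the function name `carrier_frequency` is an illustrative choice and not part of the study's methods.

```python
# Populations in which both the orfH79 and B-atp6-orfH79 primer pairs
# amplified products, and those in which neither did (codes from the text).
positive = ["HK", "GZ", "PS", "TL", "YJ"]
negative = ["NHNC", "WN", "QH", "NN", "LB", "XZ", "HZ", "GP", "YL", "FC", "BH", "DX"]


def carrier_frequency(pos, neg):
    """Fraction of surveyed populations in which the marker was amplified."""
    total = len(pos) + len(neg)
    return len(pos) / total


freq = carrier_frequency(positive, negative)
print(f"{len(positive)} of {len(positive) + len(negative)} populations: {freq:.1%}")
# prints "5 of 17 populations: 29.4%"
```

Note that this is a frequency over populations, not over individuals; a per-individual frequency would need the sample sizes from Table 1.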
Table 1
Detailed information about 17 populations of Oryza rufipogon collected from China including geographical location, altitude range, longitude, latitude and ecotype
Population localities | Population code | Species | Number | Altitude (m a.s.l.) | Longitude (E) | Latitude (N) | Ecotype |
Hainan, Sanya | NHNC | Oryza rufipogon | 15 | 3 | 109°29′ | 18°17′ | stolon |
Hainan, Wanning, Dongao | WN | Oryza rufipogon | 25 | 45 | 110°.37 | 18°.73 | stolon |
Hainan, Qionghai, Zhongyuan | QH | Oryza rufipogon | 25 | 56 | 110°29′ | 19°06′ | erect |
Hainan, Haikou, Longhua | HK* | Oryza rufipogon | 25 | 92 | 110°20′ | 19°57′ | Head-up tilt |
Yunnan, Yuanjiang, Donger | YJ | Oryza rufipogon | 5 | 700 | 102° | 23°59′ | erect |
Guangxi, Longan, Natong | NN | Oryza rufipogon | 20 | 98 | 108°12′ | 22°40′ | erect |
Guangxi, Laibin, Shiya | LB | Oryza rufipogon | 30 | 133 | 109°27′ | 23°28′ | Head-up tilt |
Guangxi, Xiangzhou, Yunjiang | XZ | Oryza rufipogon | 30 | 336 | 109°46′ | 24°07′ | Stolon/erect |
Guangxi, Zhongshan, Chengxiang | HZ | Oryza rufipogon | 30 | 438 | 111°19′ | 24°34′ | Head-up tilt |
Guangxi, Guiping, Xunwan | GP | Oryza rufipogon | 30 | 113 | 110°08′ | 23°24′ | stolon |
Guangxi, Yulin, Fumian | YL | Oryza rufipogon | 30 | 237 | 110°04′ | 22°23′ | Head-up tilt |
Guangxi, Fancheng, Huashi | FC | Oryza rufipogon | 30 | 71 | 108°11′ | 21°45′ | stolon |
Guangxi, Beihai, Fucheng | BH | Oryza rufipogon | 27 | 54 | 109°18′ | 21°35′ | Stolon/ head-up tilt |
Guangdong, Gaozhou, Zhenjiang | GZ | Oryza rufipogon | 30 | 91 | 110°42′ | 21°51′ | stolon |
Guangdong, Gaozhou, Zhenjiang | PS | Oryza rufipogon | 25 | -57 | 110°42′ | 21°48′ | head-up tilt/ stolon |
Guangdong, Gaozhou, Zhenjiang | TL | Oryza rufipogon | 26 | 83 | 110°44′ | 21°47′ | head-up tilt/ stolon |
Jiangxi, Dongxiang, Gangshangji | DX | Oryza rufipogon | 24 | 276 | 116°31′ | 28°05′ | Head-up tilt |
*, indicates that this population has disappeared since 2015. |
Table 2
Primers used in this study and the number of individuals yielding PCR products in each relevant population
Primer name | Primer sequence | Populations (number of individuals with PCR products) |
orf79 | F: 5'-ATGACAAATCTGCTCCGATG-3' | GZ(30), HK(30), YJ(5), TL(25), PS(26) |
R: 5'-CTTACTTAGGAAAGACTAC-3' |
B-atp6-orfH79 | F: 5'-TCCTTGTCTATGGCGGTAA-3' | GZ(30), HK(30), YJ(5), TL(25), PS(26) |
R: 5'-GAGCAAACCACCACTGTCC-3' |
atp6 | F: 5'-CTGAATGGAGGAACGGCGAT-3' | Every individual |
R: 5'-AGCATAGTCCAAGCGAACCC-3' |
F, forward primer; R, reverse primer. The third column lists the populations in which amplified products were detected, with the number of individuals in parentheses. |
Consistent with atp6 being a conserved gene, no differences were found among the atp6 sequences of the O. rufipogon populations. Fifteen individuals were randomly selected for sequencing from each population, except for the YJ population, which comprised only five individuals. Sequencing revealed only one haplotype among all individuals of the same population.
Among the five populations, three haplotypes (named H1, H2, and H3) were detected with the primers for orfH79, and three haplotypes (named BH1, BH2, and BH3) were detected with the primers for B-atp6-orfH79. Specifically, the H1, H2, and H3 sequences were identical to the corresponding sequences of BH1, BH2, and BH3 in each population, respectively (Table 3, Table 4 and Fig. 1). For instance, the PS and GZ populations shared H1 and BH1, while the TL and HK populations shared H2 and BH2. Furthermore, H3 and BH3 were detected only in the YJ population.
Table 3
OrfH79 and its variants (GSV), and B-atp6-orfH79 and its variants (B-atp6-GSV) among the populations of Oryza rufipogon in China
DNA section |
Haplotype | Population (n) | Haplotype | Population (n) |
H1 | PS16(15), GZ30(15) | BH1 | PS16(15), GZ15(15) |
H2 | TL(15), HK(15) | BH2 | TL22(15), HK6(15) |
H3 | YJ(5) | BH3 | YJ5(5) |
Numbers in brackets give the number of individuals sequenced in each population |
Table 4
The identity between different haplotypes
Haplotype | Haplotype | Identity (%) | Haplotype | Haplotype | Identity (%) |
BH1 | BH2-HL | 99% | BH2-HL | BH3 | 98% |
BH1 | BH3 | 98% | BH2-HL | BT-Dian 1 | 97% |
BH1 | BT-Dian 1 | 98% | BH2-HL | Lead-Liao | 97% |
BH1 | Lead-Liao | 97% | BH3 | BT-Dian 1 | 100% |
BT-Dian 1 | Lead-Liao | 100% | BH3 | Lead-Liao | 99% |
BH2-HL indicates the BH2 haplotype and B-atp6-orfH79, which are identical; BT-Dian 1 indicates B-atp6-orf79, which was detected in the CMS-BT and CMS-Dian 1 sterile lines; Lead-Liao indicates the B-atp6-L-orf79 haplotype, which was detected in the CMS-Liao and CMS-Lead sterile lines |
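The source does not state how the identity percentages in Table 4 were computed. As a minimal sketch, assuming identity means the share of identical columns in a pairwise alignment, it could be calculated as follows. The function `percent_identity` and the toy sequences are hypothetical and are not the BH haplotype sequences.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns that carry the same residue.

    Assumes `a` and `b` are two rows of the same alignment (equal length,
    '-' for gaps). Columns where both rows are gaps are skipped; a gap
    opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment with one substitution in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # prints 90.0
```

Other conventions exist (for example, ignoring gapped columns entirely), and they can shift the value by a percentage point or more on short sequences, so rounded figures such as those in Table 4 should be compared with that in mind.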
Sequence structure characteristics of B-atp6-GSV, phylogenetic analysis and the uniform chimeric trait
B-atp6-orf79, B-atp6-orfH79, and B-atp6-L-orf79 are the previously published haplotypes in O. rufipogon [1–4]. Among the three haplotypes detected in this research, BH2 was identical to B-atp6-orfH79, whereas BH1 and BH3 were novel haplotypes in O. rufipogon populations. B-atp6-orf79, B-atp6-orfH79, B-atp6-L-orf79, BH1, and BH3 constitute the B-atp6-GSV region. Previously, B-atp6-orf79 was detected in the CMS-Dian 1 and CMS-BT cytoplasm types, and B-atp6-L-orf79 was detected in the CMS-Lead and CMS-Liao types [4, 44]. A total of eighteen haplotypes (i.e., BH1-BH18) were compiled from GenBank, and their common characteristics and unique variations are shown in Fig. 2. Their GenBank accession numbers in the NCBI database, together with other information such as population and species, are provided in Additional file 1: Table S1. All were chimeric sequences containing a 671 bp conserved sequence composed of B-atp6 (619 bp) and the 52 bp downstream of B-atp6 (DS). B-atp6 and DS were identical to the corresponding sequences of atp6 and its downstream region in O. sativa and O. rufipogon [54]. There was no difference in this sequence among the eighteen haplotypes, except for BH18 and a single nucleotide polymorphism at position 668 (C to G) in BH10. The complex variable sequence (VS), which connects the DS and the GSV, is a 176 bp sequence containing five insertions or deletions and more than 30 single nucleotide polymorphisms, as shown in Fig. 2.
Maximum-likelihood phylogenetic analysis of the B-atp6-GSV region of the eighteen haplotypes (Fig. 3) indicated three clades with high support. Owing to primer selection, the last 26 bases of GSV were not amplified, so these 26 bases were excluded from the present and subsequent analyses. BH16 and BH17 formed one clade, BH1, BH2 and BH13-15 formed another, and the remaining haplotypes (i.e., BH3-12 and BH18) formed the third (Fig. 3). The new haplotype BH1 was close to BH13, while the other new haplotype, BH3, clustered with the existing BH12 as a sub-clade. BH16 and BH17 formed a clade with no sequential relationship between them, and both exist only in O. rufipogon distributed in southern Asia (detected only four times). Specifically, BH17 was detected only in Thailand, while BH16 was detected in India and Sri Lanka [36]. BH1-2 and BH13-15 formed another clade with high support for each branch, and all of them were detected only in O. rufipogon. The new haplotype BH1 is found on the Chinese mainland, in Guangdong province, while BH2 is distributed on both the Chinese mainland and Hainan Island, including Zhenjiang town in Guangdong province and Longhua district in Hainan province (Fig. 1). BH2, whose populations generally lie south of those carrying BH1, has also been reported in Thailand. BH13 is found in China, while both BH14 and BH15 are found in India [36]. However, the WN and QH populations, which are close to the BH13 location, did not carry B-atp6-GSV, indicating its independent occurrence.
The third clade has 100% support and contains all the haplotypes from O. sativa, although support within the clade is relatively low. BH6 and BH8, found in India (each detected once), were carried only by O. rufipogon, whereas BH4-5 and BH9-10 grouped into one branch and were carried by O. rufipogon, O. sativa or O. barthii. BH4 was found in all three of these Oryza species, while BH5 and BH9 were carried by both O. rufipogon and O. sativa. However, BH10 was found only in O. sativa grown in Australia. Except for BH6, the haplotypes detected in O. sativa are located in this third-level clade. All the haplotypes carried by O. sativa from Australia fall in this clade. The O. sativa sources of BH5 are very diverse. The O. rufipogon populations distributed in India, Bangladesh and Thailand carried the BH4, BH5 and BH9 haplotypes, respectively. BH7 was found in O. rufipogon populations from India and in O. sativa populations from the Philippines and Africa (Nigeria, Madagascar) [36]. BH11-12 were detected only in China. Apart from one accession from Thailand, BH18 was found only in material from China. BH3 is a novel haplotype found only in Yunnan province, China, and its origin needs further study. In general, the haplotypes show some geographical structure; nonetheless, the distribution of B-atp6-GSV does not strictly follow geography. Species boundaries exist but are not absolute. For example, all haplotypes from O. sativa cluster in the third clade, yet they can also be found in wild rice.
The populations carrying B-atp6-GSV occur irregularly and show several different phenotypes (Additional file 2: Fig. S1). The large number of B-atp6-GSV haplotypes provides a clear picture of the structure, including its conserved and variable parts. We speculate that B-atp6 has a uniform chimeric trait in the B-atp6-GSV structure because of the inverted repeat sequence, such as GGGCGGGGG……GGGGGCGGG, in the B-atp6 structure. Li et al. [34] reported seven additional haplotypes of orfH79, i.e., W11, W15(34), W20, W21, W29, W34, and W46. The GSV sections of the eighteen BH haplotypes of B-atp6-GSV could be divided into ten GSV haplotypes belonging to three species (i.e., O. rufipogon, O. nivara and O. sativa). Taken together, and after merging completely identical sequences, seventeen haplotypes were found in the GSV section, belonging to six species of Oryza (Table 5); this grouping does not conform to the classification based on B-atp6-GSV. Specifically, BH4-BH6 and BH8-BH10 have the same GSV sequence as W46 and orf79, while BH2, BH14 and BH15 have the same GSV sequence as W42, W45, YtA and orfH79. Additionally, BH7 is identical to L-orf79. These seventeen GSV haplotypes translate into eleven different amino acid sequences, because the variants W34, W46, LR794109 and orf79 encode the same amino acid sequence (Fig. 4). Haplotype W20 was identical to W15 and W42 except for variation in those 26 bases. The first 34 bases of the GSV, which are identical to the corresponding sequence of cytochrome oxidase subunit II (COXII), are shared with nuclear DNA, indicating that all GSV variants are chimeric sequences.
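The reduction from seventeen nucleotide haplotypes to eleven amino acid sequences follows from synonymous substitutions: different nucleotide haplotypes can translate to the same protein. The sketch below illustrates the grouping step under two assumptions that are not taken from the source: translation uses the standard genetic code, and RNA editing is ignored. The sequences and the names `hapA`, `hapB`, `hapC` are hypothetical toy data, not GSV sequences.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate in frame 1; incomplete trailing codons are dropped and
    codons with unexpected characters become 'X'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def group_by_protein(haplotypes: dict) -> dict:
    """Map each distinct amino acid sequence to the haplotypes encoding it."""
    groups = defaultdict(list)
    for name, seq in haplotypes.items():
        groups[translate(seq)].append(name)
    return dict(groups)


# Hypothetical toy haplotypes: hapB differs from hapA by a synonymous change
# (ACA -> ACG, both Thr); hapC carries a non-synonymous one (AAT -> AGT).
toy = {"hapA": "ATGACAAATCTG", "hapB": "ATGACGAATCTG", "hapC": "ATGACAAGTCTG"}
groups = group_by_protein(toy)
print(len(toy), "nucleotide haplotypes ->", len(groups), "amino acid sequences")
# prints "3 nucleotide haplotypes -> 2 amino acid sequences"
```

Applied to the real GSV sequences, the same grouping would show which haplotypes share a protein, as reported for W34, W46, LR794109 and orf79.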
Table 5
Variable nucleotide sites in the gametophytic male sterility gene orfH79 and its variants (GSV).
Haplotype or accession number | Nucleotide site No. |
| 4 | 13 | 67 | 95 | 142 | 146 | 147 | 178 | 226 |
H2 | A | C | T | T | T | A | C | T | ? |
H1 | . | G | . | . | G | . | A | C | . |
H3 | G | G | . | . | A | C | T | . | . |
orfH79 | . | . | . | . | . | . | . | . | G |
orf79 | G | G | . | . | A | . | A | C | G |
L-orf79 | G | G | . | . | A | C | A | C | G |
YtA | . | . | . | . | . | . | . | . | G |
W11 | . | . | . | . | A | . | . | C | G |
W15 | . | . | . | . | . | . | . | . | G |
W20 | . | . | . | . | . | . | . | . | A |
W21 | . | . | . | . | A | . | A | C | G |
W29 | . | . | . | C | A | C | T | C | G |
W34 | G | G | C | . | A | . | A | C | G |
W42 | . | . | . | . | . | . | . | . | G |
W46 | G | G | . | . | A | . | A | C | G |
LR794116 (H12) | G | G | . | . | A | C | T | . | G |
LR794113 (H9) | G | G | . | . | A | . | A | C | G |
LR794114 (H10) | G | G | . | . | A | . | A | C | G |
LR794110 (H8) | G | G | . | . | A | . | A | C | G |
LR794112 (H5) | G | G | . | . | A | . | A | C | G |
LR794109 (H5) | G | G | . | . | A | . | A | C | G |
LR794115 (H11) | G | G | . | . | A | C | T | C | G |
LR794123 (H18) | G | G | . | . | C | C | T | C | G |
LR794111 (H7) | G | G | . | . | A | C | A | C | G |
LR794108 (H4) | G | G | . | . | A | . | A | C | G |
LR794119 (H14) | . | . | . | . | . | . | . | . | G |
LR794120 (H15) | . | . | . | . | . | . | . | . | G |
LR794118 (H2) | . | . | . | . | . | . | . | . | G |
LR794117 (H13) | . | . | . | . | . | . | . | C | G |
LR794121 (H16) | . | G | . | . | A | . | A | C | G |
LR794122 (H17) | . | G | . | . | G | . | A | C | G |
., indicates that the character state is the same as in the H2 haplotype; ?, indicates that the character state is unknown. Haplotypes YtA, W11, W15, W20, W21, W29, W34, W42, and W46 were described by Li et al. 2008. L-orf79 was described by Kazama et al. 2016, and orf79 by Wang et al. 2006. LR794116-LR794122 are the sequences of B-atp6-orfH79 from He et al. 2020; their GSV sections were renamed H2 and H3-H18 according to the above-mentioned method in this study. The same background color indicates the same haplotype. |
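Rows of Table 5 that show the same states at every variable site represent the same GSV haplotype. As a hedged sketch, the grouping can be reproduced by expanding each "." to the H2 reference state and collecting identical patterns. Only a subset of rows is transcribed here, and the names `resolve` and `collapse` are illustrative.

```python
from collections import defaultdict

# Variable sites and the H2 reference row from Table 5 ('?' = unknown state).
SITES = (4, 13, 67, 95, 142, 146, 147, 178, 226)
REFERENCE = "ACTTTACT?"

# A subset of Table 5 rows; '.' means "same state as H2" at that site.
ROWS = {
    "orfH79": "........G",
    "orf79": "GG..A.ACG",
    "L-orf79": "GG..ACACG",
    "W20": "........A",
    "W46": "GG..A.ACG",
    "LR794111": "GG..ACACG",
    "LR794116": "GG..ACT.G",
}


def resolve(pattern: str, reference: str = REFERENCE) -> str:
    """Replace each '.' with the reference state at the same site."""
    if len(pattern) != len(reference):
        raise ValueError("pattern and reference must cover the same sites")
    return "".join(r if p == "." else p for p, r in zip(pattern, reference))


def collapse(rows: dict) -> dict:
    """Group row names by their resolved states at the variable sites."""
    groups = defaultdict(list)
    for name, pattern in rows.items():
        groups[resolve(pattern)].append(name)
    return dict(groups)


groups = collapse(ROWS)
for states, names in groups.items():
    print(states, names)
```

In this subset, orf79 groups with W46 and L-orf79 groups with LR794111 (H7). The comparison covers only the tabulated sites, so it assumes all other positions are invariant among the rows.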
Population Differentiation
The hierarchical AMOVA indicated that the variation occurred among populations (FST = 1; all partitions were significant at P < 0.001). Based on the VS region of B-atp6-GSV (Fig. 2), the YJ population was assigned to group 1, while the other populations (HK, GZ, PS, and TL) formed group 2. Of the total variation, 33.3% was partitioned among groups and 66.7% among populations within groups (Table 6). No variation was found within any population.
Table 6
Hierarchical analysis of molecular variance (AMOVA) of samples of Oryza rufipogon based on nucleotide sequences of B-atp6-GSV
Source of variation | d.f. | Sum of squares | Variance components | Percentage of variation | Fixation indices |
Among groups | 1 | 3.462 | 0.16667 (Va) | 33.33% | FCT=0.33333 |
Among populations within groups | 3 | 15 | 0.33333 (Vb) | 66.67% | FSC=1 |
Within populations | 60 | 0.000 | 0.00000 (Vc) | 0 | |
Total | 64 | 18.462 | 0.5 | | FST=1 |
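The fixation indices in Table 6 can be recovered from the variance components using the usual hierarchical AMOVA definitions: FCT = Va / (Va + Vb + Vc), FSC = Vb / (Vb + Vc) and FST = (Va + Vb) / (Va + Vb + Vc). The sketch below applies these definitions to the tabulated values; the function name is illustrative, and the software used in the study is not assumed.

```python
def fixation_indices(va: float, vb: float, vc: float):
    """Return (FCT, FSC, FST) from hierarchical AMOVA variance components.

    va: among groups; vb: among populations within groups;
    vc: within populations.
    """
    total = va + vb + vc
    fct = va / total            # differentiation among groups
    fsc = vb / (vb + vc)        # among populations within groups
    fst = (va + vb) / total     # among populations overall
    return fct, fsc, fst


# Variance components from Table 6.
va, vb, vc = 0.16667, 0.33333, 0.0
fct, fsc, fst = fixation_indices(va, vb, vc)
print(f"FCT={fct:.3f}  FSC={fsc:.3f}  FST={fst:.3f}")
# prints "FCT=0.333  FSC=1.000  FST=1.000"

# Percentage of variation per level.
total = va + vb + vc
print([f"{100 * v / total:.2f}%" for v in (va, vb, vc)])
```

Because the within-population component is zero, FSC and FST both equal 1 regardless of how the among-population variation is split between the two upper levels.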
Phylogenetic Reconstruction for GSV
In this study, three GSV haplotypes, H1, H2 (orfH79) and H3, were detected in Oryza rufipogon. In addition, orf79 and L-orf79 are well-known gametophytic CMS genes: orf79 is the sterility gene of CMS-BT and CMS-Dian1 (Dian-1 type CMS), while L-orf79 is the sterility gene of CMS-Lead (Lead type CMS) and CMS-Liao. Furthermore, W11, W21, W29, W34, W42, and W46 are variants of orfH79 in wild rice [34]. As shown in Fig. 5, the maximum-likelihood analysis cannot reflect the genetic relationships among the six species according to their comprehensive characters. The amino acids corresponding to the variable nucleotide sites of each GSV haplotype are shown in Fig. 4, and the variable nucleotide sites are listed in Table 5. Overall, single nucleotide polymorphisms at 7 loci give rise to eleven different amino acid sequences.