Genotyping
SSR genotyping
A total of 378 bands (total number of bands, TNB) were detected using 48 core SSR primer pairs (Table 1), of which 336 were polymorphic. The number of polymorphic bands (NPB) per primer pair averaged 7 and ranged from 1 to 14; RM278 detected the most (14) and RM311 the fewest (1). The percentage of polymorphic bands (PPB) averaged 88.87% and ranged from 50% to 100%. The polymorphism information content (PIC) averaged 0.77 and ranged from 0.19 to 0.88. These data show that the core SSR markers in rice produce abundant bands with a high rate of polymorphism.
Table 1 Information and amplification results of SSR primers
Primer name | Chr. | Sequence (5'-3') | Annealing temperature (℃) | TNB | NPB | PPB (%) | PIC
RM583 | 1 | F:agatccatccctgtggagag; R:gcgaactcgcgttgtaatc | 55 | 10 | 10 | 100 | 0.86
RM71 | 2 | F:ctagaggcgaaaacgagatg; R:gggtgggcgaggtaataatg | 55 | 8 | 8 | 100 | 0.84
RM85 | 3 | F:ccaaagatgaaacctggattg; R:gcacaaggtgagcagtcc | 55 | 9 | 9 | 100 | 0.85
RM471 | 4 | F:acgcacaagcagatgatgag; R:gggagaagacgaatgtttgc | 55 | 8 | 6 | 75 | 0.86
RM274 | 5 | F:cctcgcttatgagagcttcg; R:cttctccatcactcccatgg | 55 | 12 | 12 | 100 | 0.84
RM190 | 6 | F:ctttgtctatctcaagacac; R:ttgcagatgttcttcctgatg | 55 | 5 | 5 | 100 | 0.74
RM336 | 7 | F:cttacagagaaacggcatcg; R:gctggtttgtttcaggttcg | 55 | 7 | 7 | 100 | 0.79
RM72 | 8 | F:ccggcgataaaacaatgag; R:gcatcggtcctaactaaggg | 55 | 12 | 9 | 75 | 0.86
RM219 | 9 | F:cgtcggatgatgtaaagcct; R:catatcggcattcgcctg | 55 | 2 | 2 | 100 | 0.36
RM311 | 10 | F:tggtagtataggtactaaacat; R:tcctatacacatacaaacatac | 55 | 2 | 1 | 50 | 0.37
RM209 | 11 | F:atatgagttgctgtcgtgcg; R:caacttgcatcctcccctcc | 55 | 4 | 3 | 75 | 0.67
RM19 | 12 | F:caaaaacagagcagatgac; R:ctcaagatggacgccaaga | 55 | 12 | 9 | 75 | 0.86
RM1195 | 1 | F:atggaccacaaacgaccttc; R:cgactcccttgttcttctgg | 55 | 8 | 8 | 100 | 0.84
RM208 | 2 | F:tctgcaagccttgtctgatg; R:taagtcgatcattgtgtggacc | 55 | 5 | 4 | 80 | 0.75
RM232 | 3 | F:ccggtatccttcgatattgc; R:ccgacttttcctcctgacg | 55 | 10 | 10 | 100 | 0.87
RM119 | 4 | F:catccccctgctgctgctgctg; R:cgccggatgtgtgggactagcg | 67 | 7 | 4 | 57.14 | 0.79
RM267 | 5 | F:tgcagacatagagaaggaagtg; R:agcaacagcacaacttgatg | 55 | 9 | 5 | 56.56 | 0.85
RM253 | 6 | F:tccttcaagagtgcaaaacc; R:gcattgtcatgtcgaagcc | 55 | 6 | 6 | 100 | 0.75
RM481 | 7 | F:tagctagccgattgaatggc; R:ctccacctcctatgttgttg | 55 | 7 | 7 | 100 | 0.80
RM339 | 8 | F:gtaatcgatgctgtgggaag; R:gagtcatgtgatagccgatatg | 55 | 8 | 8 | 100 | 0.79
RM278 | 9 | F:gtagtgagcctaacaataatc; R:tcaactcagcatctctgtcc | 55 | 14 | 14 | 100 | 0.85
RM258 | 10 | F:tgctgtatgtagctcgcacc; R:tggcctttaaagctgtcgc | 55 | 7 | 6 | 85.71 | 0.80
RM224 | 11 | F:atcgatcgatcttcacgagg; R:tgctataaaaggcattcggg | 55 | 8 | 8 | 100 | 0.84
RM17 | 12 | F:tgccctgttattttcttctctc; R:ggtgatcctttcccatttca | 55 | 9 | 9 | 100 | 0.78
RM493 | 1 | F:tagctccaacaggatcgacc; R:gtacgtaaacgcggaaggtg | 55 | 7 | 7 | 100 | 0.83
RM561 | 2 | F:gagctgttttggactacggc; R:gagtagctttctcccacccc | 55 | 8 | 5 | 62.50 | 0.85
RM8277 | 3 | F:agcacaagtaggtgcatttc; R:atttgcctgtgatgtaatagc | 55 | 7 | 7 | 100 | 0.75
RM551 | 4 | F:agcccagactagcatgattg; R:gaaggcgagaaggatcacag | 55 | 6 | 6 | 100 | 0.68
RM598 | 5 | F:gaatcgcacacgtgatgaac; R:atgcgactgatcggtactcc | 55 | 9 | 5 | 55.56 | 0.75
RM176 | 6 | F:cggctcccgctacgacgtctcc; R:agcgatgcgctggaagaggtgc | 67 | 10 | 7 | 70 | 0.88
RM432 | 7 | F:ttctgtctcacgctggattg; R:agctgcgtacgtgatgaatg | 55 | 5 | 5 | 100 | 0.71
RM331 | 8 | F:gaaccagaggacaaaaatgc; R:catcatacatttgcagccag | 55 | 8 | 7 | 87.50 | 0.82
OSR28 | 9 | F:agcagctatagcttagctgg; R:actgcacatgagcagagaca | 55 | 10 | 9 | 90 | 0.80
RM590 | 10 | F:catctccgctctccatgc; R:ggagttggggtcttgttcg | 55 | 9 | 6 | 66.67 | 0.87
RM21 | 11 | F:acagtattccgtaggcacgg; R:gctccatgagggtggtagag | 55 | 11 | 11 | 100 | 0.87
RM3331 | 12 | F:cctcctccatgagctaatgc; R:aggaggagcggatttctctc | 50 | 6 | 4 | 66.67 | 0.80
RM443 | 1 | F:gatggttttcatcggctacg; R:agtcccagaatgtcgtttcg | 55 | 10 | 7 | 70 | 0.75
RM490 | 1 | F:atctgcacactgcaaacacc; R:agcaagcagtgctttcagag | 55 | 9 | 9 | 100 | 0.82
RM424 | 2 | F:tttgtggctcaccagttgag; R:tggcgcattcatgtcatc | 55 | 5 | 5 | 100 | 0.72
RM423 | 2 | F:agcacccatgccttatgttg; R:cctttttcagtagccctccc | 55 | 7 | 7 | 100 | 0.82
RM571 | 3 | F:ggaggtgaaagcgaatcatg; R:cctgctgctctttcatcagc | 55 | 7 | 7 | 100 | 0.67
RM231 | 3 | F:ccagattatttcctgaggtc; R:cacttgcatagttctgcattg | 55 | 12 | 12 | 100 | 0.84
RM567 | 4 | F:atcagggaaatcctgaaggg; R:ggaaggagcaatcaccactg | 55 | 10 | 10 | 100 | 0.78
RM289 | 5 | F:ttccatggcacacaagcc; R:ctgtgcacgaacttccaaag | 55 | 10 | 10 | 100 | 0.88
RM542 | 7 | F:tgaatcaagcccctcactac; R:ctgcaacgagtaaggcagag | 55 | 8 | 7 | 87.50 | 0.84
RM316 | 9 | F:ctagttgggcatacgatggc; R:acgcttatatgttacgtcaac | 55 | 2 | 2 | 100 | 0.19
RM332 | 11 | F:gcgaaggcgaaggtgaag; R:catgagtgatctcactcaccc | 55 | 10 | 8 | 80 | 0.88
RM7102 | 12 | F:taggagtgtttagagtgcca; R:tcggtttgcttatacatcag | 55 | 3 | 3 | 100 | 0.43
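The PPB column of Table 1 can be related to the band counts with a short calculation. The sketch below assumes that PPB is NPB divided by TNB, expressed as a percentage; the text does not state the definition explicitly, and the function name `ppb` is illustrative only.

```python
def ppb(tnb: int, npb: int) -> float:
    """Percentage of polymorphic bands, assumed here to be NPB / TNB * 100."""
    if tnb <= 0:
        raise ValueError("TNB must be positive")
    if not 0 <= npb <= tnb:
        raise ValueError("NPB must lie between 0 and TNB")
    return round(100.0 * npb / tnb, 2)

# (TNB, NPB) pairs copied from four rows of Table 1
rows = {"RM583": (10, 10), "RM311": (2, 1), "RM119": (7, 4), "RM258": (7, 6)}
ppb_values = {name: ppb(tnb, npb) for name, (tnb, npb) in rows.items()}

for name, value in ppb_values.items():
    print(f"{name}: PPB = {value}%")
```

For these four primer pairs the computed values agree with the PPB entries listed in Table 1 (100, 50, 57.14 and 85.71).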
SNP genotyping
With NlaIII-GBS alone and MseI-GBS alone, 39,872 and 35,547 SNPs, respectively, passed the minor allele frequency (MAF) threshold of 0.05. Merging the NlaIII-GBS and MseI-GBS data gave a total of 72,824 SNPs, including 67,621 SNPs that aligned to specific chromosomes and 5,023 SNPs that did not. The average MAF across the 93 samples was 0.21.
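The MAF filter described above can be illustrated with a minimal sketch. It assumes biallelic SNPs coded as alternate-allele counts (0, 1, 2, with `None` for a missing call) and treats the 0.05 limit as inclusive; the coding, the function names and the toy genotypes are assumptions for illustration, not the pipeline used in the study.

```python
from typing import Optional, Sequence

def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """MAF of one biallelic SNP from alternate-allele counts (0, 1 or 2)."""
    called = [g for g in genotypes if g is not None]  # drop missing calls
    if not called:
        raise ValueError("no called genotypes")
    alt_freq = sum(called) / (2 * len(called))        # two alleles per diploid sample
    return min(alt_freq, 1.0 - alt_freq)

def passes_maf(genotypes: Sequence[Optional[int]], threshold: float = 0.05) -> bool:
    """True if the SNP meets the lower MAF limit (assumed inclusive)."""
    return minor_allele_frequency(genotypes) >= threshold

# Toy genotype vectors (hypothetical, not taken from the study)
snp_common = [0, 0, 1, 2, None, 0, 1, 0]
snp_rare = [0] * 39 + [1]

print(minor_allele_frequency(snp_common), passes_maf(snp_common))
print(minor_allele_frequency(snp_rare), passes_maf(snp_rare))
```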
Linkage disequilibrium (LD) decay and haplotype construction
Across the 67,621 chromosome-aligned SNPs and 93 samples, there were 6,288,753 genotype calls, of which 326,873 (5.198%) were heterozygous. The 67,621 SNPs were unevenly distributed among the 12 chromosomes (Fig. 1a); chromosome 1 contained the most markers (8,425) and chromosome 8 the fewest (3,953). Among the 84,255 SNP pairs, R² had a minimum of 0.2 and an average of 0.73. A total of 46,322 SNP pairs (54.98%) had R² values higher than 0.8, and 7,841 pairs (9.31%) were in complete LD (R² = 1). The average physical distance between markers was 154,130 bp for all pairs, 171,768 bp for pairs with R² below 0.8, and 139,686 bp for pairs with R² of 0.8 or higher. LD, as represented by inter-locus R² values, decreased as the physical distance between loci increased (Fig. 1b). The 12 chromosomes yielded a total of 6,568 predicted haplotypes (Fig. 1c), with chromosome 1 having the most (776) and chromosome 10 the fewest (349). The largest haplotype comprised 95 SNPs. The longest haplotype spanned 200.0 kb, and the average haplotype length was 33.71 kb.
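To make the pairwise LD statistic concrete, the following sketch computes the squared correlation between two biallelic loci from phased haplotypes, using the standard definition D² / (pA(1 − pA) pB(1 − pB)). The software and phasing used in the study are not stated, so the input format, the function name `ld_r2` and the toy haplotypes are assumptions.

```python
from typing import Sequence, Tuple

def ld_r2(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """r^2 between two biallelic loci from phased haplotypes coded (a, b), alleles 0/1."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r^2 is undefined")
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom

# Toy data (hypothetical): perfectly coupled alleles, then two recombinant haplotypes
complete_ld = [(0, 0)] * 6 + [(1, 1)] * 4
partial_ld = [(0, 0)] * 5 + [(1, 1)] * 3 + [(1, 0), (0, 1)]

print(ld_r2(complete_ld))
print(ld_r2(partial_ld))
```

Pairs such as `complete_ld` correspond to the "complete LD" class described above, while recombinant haplotypes lower the statistic.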
Population genetic structure analysis
Two types of data were used for cluster analysis: three kinds of DNA markers and 15 agronomic traits.
PC analysis
In the principal component analysis (PCA), the first three principal components (PCs) together accounted for 40.69%, 39.76%, 40.10% and 15.76% of the variance for NlaIII-GBS only, MseI-GBS only, the merged NlaIII-GBS and MseI-GBS data, and SSR, respectively. PCA separated the 93 genotypes into two subgroups (Fig. 4), consistent with the UPGMA and STRUCTURE results. W366 and W367 clustered separately from the other genotypes (Fig. 2).
UPGMA
The unweighted pair-group method with arithmetic means (UPGMA) was applied to the 93 genotypes and divided them into two subgroups (Fig. 3). Depending on the marker dataset, group I included 1 to 3 samples and group II contained the remaining 92 to 90 samples. Based on the merged NlaIII-GBS and MseI-GBS data, the genetic distance averaged 0.29 and ranged from 0.02 to 0.55; the two most closely related materials were W710 and W711, and the two most distantly related were W366 and W740.
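The UPGMA procedure can be sketched in a few lines. This is a minimal, unoptimized illustration of average-linkage clustering on a pairwise distance table; the sample names `S1` to `S4` and their distances are hypothetical, and the study's own software is not specified in the text.

```python
from itertools import combinations

def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    dist maps frozenset({a, b}) to the distance between samples a and b.
    Returns the merges in order as (cluster1, cluster2, distance at merge).
    """
    clusters = [(name,) for name in names]
    merges = []

    def avg_dist(c1, c2):
        # UPGMA distance: mean of all pairwise distances between cluster members
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append((c1, c2, avg_dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical distances among four samples
names = ["S1", "S2", "S3", "S4"]
dist = {
    frozenset(("S1", "S2")): 0.02, frozenset(("S1", "S3")): 0.30,
    frozenset(("S2", "S3")): 0.32, frozenset(("S1", "S4")): 0.55,
    frozenset(("S2", "S4")): 0.53, frozenset(("S3", "S4")): 0.50,
}
merges = upgma(names, dist)
for c1, c2, d in merges:
    print(c1, "+", c2, "at distance", round(d, 4))
```

In this toy example the closest pair joins first and the most divergent sample joins last, which mirrors how a single outlying accession ends up as its own group in the dendrogram.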
Bayesian clustering
After excluding SNPs with MAF < 5%, 70,824 SNPs were used to assess the population structure of the entire pool of rice germplasm. Delta K reached its maximum at K = 2, suggesting that the 93 rice samples fall into two subgroups (consisting of 70 and 23 samples) (Fig. 4). In the population structure analysis, the results from K = 2 to K = 5 revealed gene introgression between group I and group II, accounting for approximately 76.34% of the observed variation (calculated at K = 2).
The analyses using PCA, UPGMA and Bayesian clustering gave similar results: the population structure is relatively simple, and stratification is not pronounced.
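The Delta K criterion used to choose K can be sketched as follows. The code assumes the Evanno-style statistic computed from replicate log-likelihoods, ΔK = |mean L(K+1) − 2·mean L(K) + mean L(K−1)| / sd L(K), with the sample standard deviation; the log-likelihood values are invented for illustration and are not results from this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Delta K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of log-likelihoods from replicate runs.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in lnp_by_k and k + 1 in lnp_by_k:
            second_diff = abs(
                mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
            )
            # Raises ZeroDivisionError if all replicates at K are identical
            result[k] = second_diff / stdev(lnp_by_k[k])
    return result

# Hypothetical replicate log-likelihoods for K = 1..5
lnp_by_k = {
    1: [-5000, -5002, -4998],
    2: [-4200, -4204, -4196],
    3: [-4100, -4110, -4090],
    4: [-4050, -4070, -4030],
    5: [-4020, -4040, -4000],
}
dk = delta_k(lnp_by_k)
best_k = max(dk, key=dk.get)
print(dk, "best K:", best_k)
```

With these invented values the statistic peaks at K = 2, the same pattern the text reports for the real data.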
Agronomic traits clustering
The clustering result based on the 15 agronomic traits is shown in Fig. 5a. All materials except W669 (92 of 93) clustered together, indicating a narrow genetic basis for the population.
Clustering of different categories of materials
We analyzed the population genetic information of the different categories of materials, comprising 57 restorer lines, 19 maintainer lines and 17 special rice accessions (Table 2), and performed UPGMA clustering for each category (Fig. 5b, 5c, 5d). The results showed that the genetic basis of the restorer lines was broader than that of the maintainer lines, and that the genetic basis of the special rice was broader than that of the conventional rice.
Table 2 Population genetic analysis of the different categories of materials
Samples | Tajima's D | Range of IBS genetic distance | Average genetic distance | Two most closely related materials | Two most distantly related materials
Whole materials (93) | 1.66 | 0.0229-0.5452 | 0.3007 | W710/W711 | W366/W740
Restorer lines (57) | 1.36672 | 0.0229-0.3927 | 0.2666 | W710/W711 | W685/W697
Maintainer lines (19) | 0.43533 | 0.0242-0.3745 | 0.2293 | W740/W741 | W725/W738
Special rice (17) | 0.62542 | 0.0285-0.5315 | 0.3280 | W375/W380 | W300/W366
Correlation analysis among genetic distance matrices from the three types of marker dataset
The coefficients of correlation (R²) between the genetic distance matrices were 0.88 for NlaIII-GBS only versus MseI-GBS only, 0.35 for NlaIII-GBS only versus SSR, 0.27 for MseI-GBS only versus SSR, and 0.33 for the merged NlaIII-GBS and MseI-GBS data versus SSR (Fig. 6). The weaker agreement between the GBS and SSR matrices may be due to the different numbers of markers used.
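A correlation between two genetic distance matrices is commonly obtained by correlating their off-diagonal entries. The sketch below assumes a Pearson correlation over the upper triangles, squared to give R²; whether the study used this exact procedure or a Mantel permutation test for significance is not stated, and the matrices are hypothetical.

```python
from math import sqrt

def upper_triangle(matrix):
    """Off-diagonal entries above the diagonal of a square symmetric matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def matrix_r2(m1, m2):
    """Squared correlation between the upper triangles of two distance matrices."""
    return pearson_r(upper_triangle(m1), upper_triangle(m2)) ** 2

# Hypothetical 4 x 4 distance matrices from two marker systems
gbs = [[0.00, 0.02, 0.30, 0.55],
       [0.02, 0.00, 0.32, 0.53],
       [0.30, 0.32, 0.00, 0.50],
       [0.55, 0.53, 0.50, 0.00]]
ssr = [[0.00, 0.10, 0.25, 0.40],
       [0.10, 0.00, 0.45, 0.35],
       [0.25, 0.45, 0.00, 0.30],
       [0.40, 0.35, 0.30, 0.00]]

print(round(matrix_r2(gbs, ssr), 3))
```

Because entries of a distance matrix are not independent, a permutation-based test is usually preferred over the ordinary p-value when significance is needed.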
AMOVA and gene flow
Tajima's D was 1.66. A positive value signifies a scarcity of rare alleles relative to neutral expectation, which is consistent with a decrease in population size and/or balancing selection. Analysis of molecular variance (AMOVA) showed that 98% of the genetic variation occurred within populations and 2% between populations, indicating only slight genetic differentiation among the 93 samples. The genetic differentiation coefficient (FST) between the two populations was 0.61, and gene flow (Nm) was 0.16. Gene flow is reported to be smallest in selfing crops and lowest in annual herbaceous plants. Nm > 1 indicates a high level of gene flow between populations and therefore little genetic differentiation among them; Nm > 4 indicates even more extensive gene exchange and still smaller differentiation; and Nm < 1 indicates that population differentiation may have occurred through genetic drift. The Nm of 0.16 indicates that gene flow among rice populations in the Qinba region is low, although nearly 2.5-fold higher than that of conventional inbred plants; this may result from long-term artificial selection, leading to reduced genetic differentiation.
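The reported FST and Nm values can be related with a short calculation. The sketch assumes the widely used island-model approximation Nm ≈ (1 − FST) / (4·FST); the text does not state which formula was applied, so this is an assumption, although it does reproduce the reported value. The interpretation thresholds are the ones given in the text, and the function names are illustrative.

```python
def gene_flow_nm(fst: float) -> float:
    """Gene flow under the island-model approximation Nm = (1 - FST) / (4 * FST)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in (0, 1]")
    return (1.0 - fst) / (4.0 * fst)

def interpret_nm(nm: float) -> str:
    """Classify Nm using the thresholds quoted in the text (Nm > 4, Nm > 1, Nm < 1)."""
    if nm > 4:
        return "extensive gene exchange; very little differentiation"
    if nm > 1:
        return "high gene flow; little differentiation"
    if nm < 1:
        return "low gene flow; differentiation possibly driven by genetic drift"
    return "boundary case (Nm = 1)"

nm = gene_flow_nm(0.61)   # FST as reported in the text
print(round(nm, 2), "->", interpret_nm(nm))
```

With FST = 0.61 the approximation gives Nm ≈ 0.16, which falls in the Nm < 1 class.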