Sequencing Information Statistics
Whole-genome sequencing of the 150 japonica rice germplasm resources produced a total of 67,800,533 SNPs, with an average of 443,140.74 SNPs per sample. Of these, 59.5% were located in intergenic regions, 7.01% in exons (Fig. S2b), and 3.95% resulted in missense mutations (Fig. S2a).
Analysis of kinship and population structure
Kinship analysis resolves the genetic distance between samples. The kinship among the 150 japonica germplasm resources showed no lineage differentiation within the population, which indicated that these resources were rich in genetic variation, had not become homogenized, and could be used as parents in japonica rice variety breeding (Fig. 1). Based on the SNPs, a phylogenetic tree was constructed with the neighbor-joining algorithm in MEGA using 1000 replications, and the results showed that the 150 japonica rice varieties could be divided into four subgroups (Fig. 2a). Population genetic structure analysis provides information on the lineage composition of individuals. The population structure was analyzed with ADMIXTURE, and clustering was performed for assumed numbers of subclusters (K values) from 1 to 10. ΔK analysis showed that the optimal number of subclusters was 4 (Fig. 2b, c). Principal component analysis was used to supplement the phylogenetic analysis, and the results showed no subpopulation differentiation within the population (Fig. 2d).
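The text does not state how ΔK was computed. As a minimal sketch, assuming an Evanno-style statistic (the absolute second-order difference of the mean log-likelihood across K, divided by the standard deviation of the replicate log-likelihoods at K), the selection of the optimal K could look like this. The function name and the replicate values are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def delta_k(loglik_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    loglik_by_k maps K to a list of log-likelihoods from replicate runs.
    """
    means = {k: mean(values) for k, values in loglik_by_k.items()}
    result = {}
    for k in sorted(loglik_by_k):
        if k - 1 in means and k + 1 in means:
            # Absolute second-order difference of the mean log-likelihood.
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(loglik_by_k[k])
    return result


# Hypothetical replicate log-likelihoods, NOT values from the study.
toy_runs = {
    2: [-5200.0, -5210.0, -5190.0],
    3: [-4900.0, -4905.0, -4895.0],
    4: [-4500.0, -4502.0, -4498.0],
    5: [-4480.0, -4490.0, -4470.0],
    6: [-4470.0, -4485.0, -4455.0],
}

scores = delta_k(toy_runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The statistic is undefined for the smallest and largest K tested, because both neighbours are needed for the second-order difference.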
Comparative analysis of agronomic traits between Guiyang and Hainan
As shown in Table 1, the grain length and seed setting rate of the 150 japonica germplasm resources did not differ significantly between Guiyang and Hainan. The grain aspect ratio was extremely significantly higher in Hainan than in Guiyang, whereas the remaining seven agronomic traits had extremely significantly lower values in Hainan than in Guiyang. The coefficients of variation, calculated separately for the 10 agronomic traits, ranged from 6.37% to 93.09% in Guiyang and from 6.32% to 119.71% in Hainan. At both sites, the coefficient of variation was smallest for grain thickness and largest for number of shriveled grains. In addition, plant height, panicle length, grain aspect ratio, filled grain number, grain number per panicle and seed setting rate showed moderate variation at both sites, which indicated that the japonica germplasm resources were rich in phenotypic variation.
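For reference, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation (the text does not say whether the sample or population form was used), and the trait values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical measurements, NOT data from the study.
toy_grain_thickness_mm = [2.0, 2.1, 2.2, 2.3, 2.4]
toy_shriveled_grains = [2, 5, 10, 30, 53]

cv_thickness = coefficient_of_variation(toy_grain_thickness_mm)
cv_shriveled = coefficient_of_variation(toy_shriveled_grains)
print(round(cv_thickness, 2), round(cv_shriveled, 2))
```

A tightly distributed trait such as grain thickness yields a small CV, while a count with a long right tail, such as shriveled grains, yields a large one.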
Analysis of agronomic traits of japonica rice from different ecological regions
In Guiyang, the agronomic traits of japonica rice from different ecological regions were characterized as follows: plants were taller in rice from Yunnan (Fig. 3a); panicles were longer in rice from the United States and Yunnan (Fig. 3b); grains were longer in rice from the United States (Fig. 3c); grains were wider (Fig. 3d) and thicker (Fig. 3f) in rice from Anhui; the grain aspect ratio was larger in rice from Heilongjiang (Fig. 3e); rice from Liaoning had more filled grains (Fig. 3g) and more grains per panicle (Fig. 3i); rice from Anhui had more shriveled grains (Fig. 3h); and rice from Jilin had a poorer seed setting rate (Fig. 3j).
In Hainan, the agronomic traits of japonica rice from different ecological regions were characterized as follows: plants were taller in rice from Yunnan (Fig. 4a); panicles were longer in rice from Yunnan and the United States (Fig. 4b); grains were longer and wider in rice from the United States (Fig. 4c, d); the grain aspect ratio was larger in rice from Heilongjiang (Fig. 4e); rice from the United States had more filled grains (Fig. 4g); and rice from Yunnan had more grains per panicle (Fig. 4i). The differences in seed setting rate among ecological regions were not significant (Fig. 4j), and grain thickness did not differ significantly among ecological regions other than Jiangsu and Liaoning (Fig. 4f).
Diversity index analysis of japonica rice from different ecological regions
The frequency distribution graphs (Fig. S3, Fig. S4) showed that all 10 agronomic traits basically conformed to a normal distribution. As shown in Table 2, the diversity index (H') of the 10 agronomic traits ranged from 0.64 to 1.04 in Guiyang and from 0.31 to 0.97 in Hainan. The diversity indexes of grain number per panicle, filled grain number and grain width were relatively high in both Guiyang and Hainan. According to the "Data Quality Standard for Rice Germplasm Resources", the peak interval of plant height in Guiyang contained medium-dwarf varieties, while that in Hainan contained dwarf varieties. The peak intervals for panicle length in both Guiyang and Hainan contained short-panicle varieties, and those for grain length contained medium-grain-length varieties. The peak interval of grain width contained wide-grain varieties in Guiyang and narrow-grain varieties in Hainan. The peak interval of grain number per panicle in Hainan contained varieties with fewer grains per panicle than that in Guiyang.
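The text does not define H'. As a minimal sketch, assuming it is the Shannon-Wiener diversity index computed from the number of varieties falling into each phenotypic class, the calculation is H' = -Σ p_i ln p_i. The class counts below are hypothetical, and the rule for binning a trait into classes is left out because the text does not state it.

```python
import math


def shannon_index(class_counts):
    """H' = -sum(p_i * ln(p_i)) over the non-empty phenotypic classes."""
    total = sum(class_counts)
    return -sum(
        (count / total) * math.log(count / total)
        for count in class_counts
        if count > 0  # empty classes contribute nothing to H'
    )


# Hypothetical numbers of varieties per class, NOT data from the study.
toy_counts = [10, 20, 40, 20, 10]
h_prime = shannon_index(toy_counts)
print(round(h_prime, 4))
```

H' grows as varieties spread more evenly over more classes, so a trait whose values pile up in one or two classes gets a low index.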
Cluster analysis of agronomic traits of japonica rice from different ecological regions
As shown in Fig. 5a, the 150 japonica rice varieties grown in Guiyang could be clustered into four categories: the first category contained 118 varieties from 12 regions, the second 13 varieties from 9 regions, the third 16 varieties from 8 regions, and the fourth 3 varieties from 2 regions. As seen in Table 3, the first category had the widest and thickest grains, the most filled grains and the highest seed setting rate; the second category had the tallest plants and the longest panicles; the third category had the longest grains and the largest grain aspect ratio; and the fourth category had the highest number of shriveled grains.
As shown in Fig. 5b, the 150 japonica rice varieties grown in Hainan could be clustered into four categories: the first category contained 145 varieties from 14 regions, the second 3 varieties from 2 regions, the third 1 variety from 1 region, and the fourth 1 variety from 1 region. As seen in Table 3, the first category had the widest grains and the most grains per panicle; the second category had the largest grain length and grain aspect ratio; the third category had the largest grain thickness; and the fourth category had the tallest plants, the longest panicles and the highest number of shriveled grains.
Correlation analysis of agronomic traits of japonica rice from different ecological regions
Figure 6 shows the close correlations among the 10 agronomic traits. In both Guiyang and Hainan, there were significant positive correlations between plant height and panicle length, grain length and grain aspect ratio, grain width and grain thickness, and filled grain number and grain number per panicle, while significant negative correlations were found between grain width and grain aspect ratio, number of shriveled grains and seed setting rate, and grain aspect ratio and grain thickness.
In Guiyang, seed setting rate was significantly negatively correlated with plant height and panicle length, and significantly positively correlated with grain length and grain thickness. The grain number per panicle was significantly negatively correlated with grain length and grain aspect ratio, and the number of shriveled grains was significantly negatively correlated with grain length, grain thickness and filled grain number. However, these correlations were not found in Hainan.
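The text does not name the correlation coefficient that was used. As an illustration only, the sketch below assumes Pearson's r and applies it to hypothetical values for number of shriveled grains and seed setting rate, a pair reported above as negatively correlated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-variety values, NOT data from the study.
toy_shriveled_grains = [5, 10, 20, 40]
toy_seed_setting_rate = [95, 91, 78, 62]

r = pearson_r(toy_shriveled_grains, toy_seed_setting_rate)
print(round(r, 3))
```

Whether a coefficient counts as significant also depends on the sample size and the chosen threshold, which this sketch does not test.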
Naming of the principal components
As shown in Table 4, the contributions of the four principal components in Guiyang were 28.22%, 27.17%, 20.39%, and 12.60%, which cumulatively explained 88.38% of the variation; those in Hainan were 36.75%, 24.70%, 16.35%, and 11.76%, which cumulatively explained 89.56% of the variation.
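The cumulative figures are simply the running sums of the individual contributions reported above, which can be checked in a few lines. The variable names are illustrative.

```python
from itertools import accumulate

# Contribution (%) of each of the four principal components, as reported in the text.
guiyang = [28.22, 27.17, 20.39, 12.60]
hainan = [36.75, 24.70, 16.35, 11.76]

# Running totals, rounded to the two decimals used in the text.
guiyang_cumulative = [round(total, 2) for total in accumulate(guiyang)]
hainan_cumulative = [round(total, 2) for total in accumulate(hainan)]

print(guiyang_cumulative[-1], hainan_cumulative[-1])
```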
In Guiyang, the first principal component can be named the shriveled grain number factor because number of shriveled grains had the highest loading coefficient; the second, the grain shape factor, because grain width and grain thickness had the highest loading coefficients; the third, the grain number factor, because filled grain number and grain number per panicle had the highest loading coefficients; and the fourth, the length factor, because panicle length and plant height had the highest loading coefficients.
In Hainan, the first principal component can be named the grain number factor because grain number per panicle, panicle length and filled grain number had the highest loading coefficients; the second, the grain shape factor, because grain aspect ratio and grain width had the highest loading coefficients; the third, the seed setting rate factor, because seed setting rate and number of shriveled grains had the highest loading coefficients; and the fourth, also a grain shape factor, because grain length and grain thickness had the highest loading coefficients.
Principal components analysis (PCA) of agronomic traits of japonica rice from different ecological regions
In Guiyang (Fig. 7a), the first principal component contained seven positive vectors (panicle length, number of shriveled grains, plant height, grain number per panicle, filled grain number, grain width and grain thickness) and three negative vectors (grain aspect ratio, grain length and seed setting rate). On this dimension, varieties with longer grains, a larger grain aspect ratio and a higher seed setting rate could be clearly selected, such as Kendao10 (JR-LN7), Longyang11 (JR-LN8) and Wuyoudao3Hao (JR-LN2). The second principal component contained six positive vectors (panicle length, number of shriveled grains, plant height, grain number per panicle, grain aspect ratio and grain length) and four negative vectors (filled grain number, grain width, grain thickness and seed setting rate). Varieties with wider grains, thicker grains and a higher seed setting rate could be clearly selected on this dimension, such as Liaojing399 (JR-LN11), Yj07 (JR-LN29) and Yungengnuo (JR-LN38).
In Hainan (Fig. 7b), the first principal component contained seven positive vectors (panicle length, number of shriveled grains, plant height, grain number per panicle, filled grain number, grain aspect ratio and grain length) and three negative vectors (grain width, grain thickness and seed setting rate). On this dimension, varieties with wider grains, thicker grains and a higher seed setting rate could be clearly selected, such as Longgeng31 (JR-LN6) and Dongnong601 (JR-LN138). The second principal component contained seven positive vectors (panicle length, plant height, grain number per panicle, filled grain number, grain width, grain thickness and seed setting rate) and three negative vectors (grain aspect ratio, grain length and number of shriveled grains). Varieties with longer grains and a larger grain aspect ratio could be selected on this dimension.
Comprehensive Evaluation of Japonica Rice from Different Ecological Regions
Factor analysis was run in SPSS 19.0 to calculate the factor scores of all japonica rice varieties and the eigenvalues (characteristic roots) and variances of the four principal components. Next, the square roots of the eigenvalues and the score of each variety on each principal component were calculated. Finally, the comprehensive score (y value) of each variety was calculated, which provides a comprehensive evaluation of the 10 agronomic traits of the japonica rice germplasm resources. As shown in Table S2, the top 10 varieties for overall agronomic traits in Guiyang were Yungengnuo (JR_LN38), Chugeng37 (JR_LN32), Chugeng28 (JR_LN34), Fuhedao255 (JR_LN50), Guangminggeng2 (JR_LN96), Chugeng7 (JR_LN39), Chugeng40 (JR_LN33), Liaogeng401 (JR_LN102), Bigeng44 (JR_LN107) and Sugeng45 (JR_LN76), while in Hainan (Table S3), the top 10 varieties were Chugeng7 (JR_LN39), Lemont (JR_LN149), Chugeng32 (JR_LN43), Chugeng28 (JR_LN34), Bidao126 (JR_LN105), Tenggeng2 (JR_LN45), Chugeng26 (JR_LN41), Tengxi138 (JR_LN153), Hexi27-2 (JR_LN145) and Chugeng39 (JR_LN44).
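The steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact procedure: each principal component score is taken as the factor score multiplied by the square root of the eigenvalue, and the comprehensive score y is assumed to be the sum of those scores weighted by each component's share of the cumulative contribution, a weighting the text does not spell out. The eigenvalues, factor scores and variety labels are hypothetical; only the Guiyang contributions come from Table 4.

```python
import math


def comprehensive_score(factor_scores, eigenvalues, contributions):
    """Contribution-weighted sum of principal component scores for one variety."""
    total = sum(contributions)
    # Principal component score = factor score * sqrt(eigenvalue).
    pc_scores = [f * math.sqrt(ev) for f, ev in zip(factor_scores, eigenvalues)]
    # Weight each component by its share of the cumulative contribution (assumed).
    return sum(c / total * s for c, s in zip(contributions, pc_scores))


contributions = [28.22, 27.17, 20.39, 12.60]  # Guiyang, from Table 4 (%)
eigenvalues = [2.82, 2.72, 2.04, 1.26]        # hypothetical values

# Hypothetical factor scores for two made-up varieties.
toy_factor_scores = {
    "variety_a": [1.0, 0.5, 0.2, 0.1],
    "variety_b": [-0.5, 0.2, 0.1, 0.0],
}

y_values = {
    name: comprehensive_score(scores, eigenvalues, contributions)
    for name, scores in toy_factor_scores.items()
}
ranking = sorted(y_values, key=y_values.get, reverse=True)
print(ranking)
```

Sorting the varieties by y in descending order gives the kind of ranking reported in Tables S2 and S3.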
Construction of optimal regression equations for comprehensive evaluation
Stepwise regression analysis was performed with SPSS 19.0 to construct the optimal regression equations for the comprehensive evaluation of japonica rice in Guiyang and Hainan, respectively (Table 5), as follows:
Guiyang: Y = -235.103 + 2.08X1 - 0.359X2 + 9.906X3 - 76.488X4 + 0.364X5 - 13.705X6 - 40.472X7 + 0.262X8 - 1.385X9 + 3.731X10
Hainan: Y = -693.983 + 1.499X1 + 0.376X2 + 6.98X3 + 16.969X40.255X5 + 24.456X6 + 147.324X7 + 0.581X8 + 18.422X9 + 13.006X10
where X1-X10 represent plant height, grain number per panicle, panicle length, grain thickness, number of shriveled grains, grain width, seed setting rate, filled grain number, grain length, and grain aspect ratio, respectively.
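As a usage sketch, the Guiyang equation can be evaluated directly from its printed coefficients. Only the Guiyang equation is encoded here, because the operator between the X4 and X5 terms of the Hainan equation is not recoverable from the text. The dictionary keys, the function name and the input values are hypothetical, and the measurement units expected for each trait are an assumption that should be checked against Table 5.

```python
# Coefficients transcribed from the Guiyang equation in the text.
GUIYANG_INTERCEPT = -235.103
GUIYANG_COEFFICIENTS = {
    "plant_height": 2.08,                # X1
    "grain_number_per_panicle": -0.359,  # X2
    "panicle_length": 9.906,             # X3
    "grain_thickness": -76.488,          # X4
    "shriveled_grain_number": 0.364,     # X5
    "grain_width": -13.705,              # X6
    "seed_setting_rate": -40.472,        # X7
    "filled_grain_number": 0.262,        # X8
    "grain_length": -1.385,              # X9
    "grain_aspect_ratio": 3.731,         # X10
}


def predict_guiyang_score(traits):
    """Evaluate Y = intercept + sum(coefficient * trait value) for one variety."""
    missing = set(GUIYANG_COEFFICIENTS) - set(traits)
    if missing:
        raise KeyError(f"missing trait values: {sorted(missing)}")
    return GUIYANG_INTERCEPT + sum(
        coefficient * traits[name]
        for name, coefficient in GUIYANG_COEFFICIENTS.items()
    )


# Hypothetical input: every trait set to zero except plant height.
toy_traits = dict.fromkeys(GUIYANG_COEFFICIENTS, 0.0)
toy_traits["plant_height"] = 100.0
toy_score = predict_guiyang_score(toy_traits)
print(round(toy_score, 3))
```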