Population structure and germination differences
Analysis of SNPs in an association panel of 311 natural rice accessions using phylogenetics and fixation index analysis revealed that the japonica (n=160) and indica (n=138) varieties are easily differentiated (Fig. 1). The mixed subgroup (n=13), comprising mainly Australian and Basmati cultivars, had more members similar to the indica subpopulation.
The germination index (GI) of harvested seeds was used to test the degree of seed dormancy for each accession. The mean GI across all 311 accessions was 0.24. The japonica mean was slightly higher (0.25) than the indica mean (0.23), and the indica group showed the larger standard deviation among accessions (Table 1). The mixed group had the lowest GI (0.15); although this group was too small to support any general conclusions, its inclusion in the subsequent GWAS analysis may provide insights into the mechanisms of seed dormancy and pre-harvest sprouting.
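As a quick consistency check on the figures quoted above, the overall mean GI can be recomputed as the size-weighted mean of the three subgroup means. The sketch below uses only the rounded values reported in the text; the function name `pooled_mean` is ours, and the result is only as precise as those rounded inputs.

```python
# Subgroup sizes and mean GI values as reported in the text (rounded).
groups = {
    "japonica": (160, 0.25),
    "indica": (138, 0.23),
    "mixed": (13, 0.15),
}


def pooled_mean(groups):
    """Return (total n, size-weighted mean of the per-group means)."""
    total_n = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * mean for n, mean in groups.values())
    return total_n, weighted_sum / total_n


total_n, overall_gi = pooled_mean(groups)
print(total_n, round(overall_gi, 2))  # 311 0.24
```

The recomputed value agrees with the reported panel-wide mean of 0.24, and the subgroup sizes sum to the 311 accessions in the panel.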
GWAS and candidate gene selection
A total of 519,158 SNPs were included in the association analysis. We used two models for GWAS, fastLMM and MLMM (Fig. 2). A QTL was assigned where the association of the peak SNP exceeded the calculated significance threshold. Eight QTLs, each spanning 200 kb, were identified (Fig. 2; Table 2).
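The assignment of fixed-width QTLs around peak SNPs can be illustrated with a minimal sketch. Several details here are assumptions rather than the procedure used in this study: the window is taken to be symmetric (100 kb either side of the peak), peaks are chosen greedily by p-value, the threshold of 1e-6 is a placeholder for the calculated threshold, and the SNP records are toy data.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    chrom: str
    pos: int        # base-pair position
    pvalue: float   # association p-value from the GWAS model


class Qtl(NamedTuple):
    chrom: str
    start: int
    end: int
    peak: Snp


def call_qtls(snps, threshold, half_window=100_000):
    """Greedy peak calling: each remaining top SNP defines one window.

    Assumption: a QTL is the peak SNP +/- half_window (200 kb in total).
    """
    significant = sorted(
        (s for s in snps if s.pvalue < threshold), key=lambda s: s.pvalue
    )
    qtls = []
    for snp in significant:
        # Skip SNPs already covered by the window of a stronger peak.
        if any(q.chrom == snp.chrom and q.start <= snp.pos <= q.end for q in qtls):
            continue
        start = max(0, snp.pos - half_window)
        qtls.append(Qtl(snp.chrom, start, snp.pos + half_window, snp))
    return qtls


# Toy data, not taken from the study.
toy_snps = [
    Snp("chr7", 10_250_000, 2e-8),
    Snp("chr7", 10_300_000, 5e-7),  # falls inside the chr7 peak window
    Snp("chr8", 3_400_000, 1e-7),
    Snp("chr9", 7_000_000, 1e-3),   # below significance, ignored
]

qtls = call_qtls(toy_snps, threshold=1e-6)  # placeholder threshold
for q in qtls:
    print(q.chrom, q.start, q.end, q.peak.pvalue)
```

With the toy input, the two significant chr7 SNPs collapse into a single 200 kb interval and the chr8 SNP defines a second one.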
Expression of candidate genes within those QTLs was examined using two transcriptomic datasets from the GEO database. Dataset GSE115371 provided gene expression data at 0 h, 1 h, 3 h, 12 h, 1 d, 2 d, 3 d and 4 d after imbibition for an Australian variety (Amaroo); only timepoints from aerobically grown samples were used (Narsai et al. 2017). Differentially expressed genes (DEGs) were identified by comparing expression at each later timepoint with expression at 0 h. Dataset SRP277875 provided expression data for a japonica variety (02428) and an indica variety (YZX) at 2, 3, and 4 days after imbibition compared with a 0 h timepoint (Yang et al. 2020). Data from the japonica variety were used for expression comparison of candidate genes (Fig. 3a).
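The comparison of each timepoint against the 0 h baseline can be sketched as a log2 fold-change calculation. This is an illustration of the structure of the comparison only: the expression values, the pseudocount, and the cutoff of |log2FC| >= 1 are hypothetical, and the statistical testing that would normally accompany DEG calling on replicated data is omitted.

```python
import math


def log2_fold_changes(expression, baseline="0 h", pseudocount=1.0):
    """log2 ratio of each timepoint to the baseline timepoint.

    The pseudocount (an assumption) avoids division by zero for
    genes with no detectable expression at the baseline.
    """
    base = expression[baseline] + pseudocount
    return {
        timepoint: math.log2((value + pseudocount) / base)
        for timepoint, value in expression.items()
        if timepoint != baseline
    }


def changed_timepoints(expression, min_abs_log2fc=1.0):
    """Timepoints whose |log2FC| versus 0 h meets the (hypothetical) cutoff."""
    changes = log2_fold_changes(expression)
    return [tp for tp, lfc in changes.items() if abs(lfc) >= min_abs_log2fc]


# Hypothetical expression values for one gene; not data from either dataset.
toy_gene = {"0 h": 7.0, "1 h": 9.0, "12 h": 15.0, "2 d": 63.0}

changes = log2_fold_changes(toy_gene)
print(changes["2 d"])                # 3.0
print(changed_timepoints(toy_gene))  # ['12 h', '2 d']
```

For the toy gene, expression at 1 h is close to the baseline, whereas the 12 h and 2 d timepoints show a two-fold and eight-fold increase respectively.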
Ten candidate genes from five QTLs identified by GWAS were differentially expressed during the first four days post-imbibition (Table 2; Fig. 3). Five genes had similar expression patterns in both datasets (Fig. 3): LOC_Os07g17400 and OsNFYA5 (expressed at 0 h and soon after imbibition); LOC_Os08g06090 and PPT1 (2 d post-imbibition); and ROMT9 (3 d post-imbibition). Two further genes, OsYAB1 and OsFLA16, also had similar expression patterns in the two datasets, peaking at 2 or 3 d post-imbibition. The remaining three genes (OsPUB6, LOC_Os09g12650, and LOC_Os09g21370) had quite distinct expression patterns in the two datasets, possibly due to varietal differences.
Bioinformatic analysis of the candidates
SNP variations in five candidate genes had previously been associated with phenotypes such as awn length, heading date, yield, and spikelet length (RiceVarMap; Online Resource 3). Three candidates, LOC_Os07g17400, PPT1, and OsFLA16, are reported to associate with awn length.
Haplotype analysis confirmed that three GI-associated SNPs in our candidate genes caused non-synonymous mutations (Online Resource 4). In ROMT9, a glycine-to-serine change was associated with a lower GI. Similarly, a threonine-to-alanine change in PPT1 was associated with a reduced GI, but the number of T-haplotype varieties was very small, so further verification is required. An asparagine-to-aspartate change in LOC_Os09g12650 was associated with a higher GI. Phylogenetic analysis of the predicted protein encoded by LOC_Os07g17400 revealed homology to AIRP family proteins in maize, Triticum, and Arabidopsis (Online Resource 6).
eSNPs associated with seed germination
To identify additional eSNPs associated with seed germination, we compared expression in the japonica and indica varieties in dataset SRP277875 (Yang et al. 2020). More than twice as many genes were up-regulated in the japonica variety as in the indica variety (Online Resource 5), suggesting that the germination difference between these two rice subpopulations may be caused, at least in part, by differential gene expression.
Five genes known to be involved in seed germination were differentially expressed in this dataset (Table 3). OsGA20ox1, OsGAP, and Os9BGlu33 were up-regulated in japonica 02428, while OsGA3ox2 and OsPP2C51 were up-regulated in indica YZX. The timing and extent of differential expression differed for each gene. The largest change was in OsGAP, whose expression was 6.5× higher in the japonica variety at 0 h, and 2.3× higher at the 3 d timepoint. We found four mutations in the promoter of Os9BGlu33 that are associated with a change in GI (Fig. 4; Online Resource 1).