Phenotypes of RIL and GWAS population
To detect the influence of heat treatment on rice seedlings, the phenotypic characters of RIL and the natural rice population with or without heat treatment were statistically analyzed (Fig. 1 and S1). The average and extreme values of the shoot length (SL), dry weight (DW), and fresh weight (FW) of RIL and natural population decreased after heat stress treatment, which confirmed that heat stress had a negative impact on rice growth. The measured values between the RIL control and the treatment groups were different, and the measured values were higher than the average value of the corresponding parents, which suggested the existence of super-parent separation of biomass traits (Table S1). The measured traits of the natural rice population and the RIL were tested to determine whether they followed a normal distribution (Fig. S2 and S3). It showed that shoot length, dry weight, and fresh weight were basically in accordance with a normal distribution in both RIL and GWAS population. According to the traits data, it was determined that the biomass traits of rice seedlings in the experiment are quantitative traits controlled by typical poly-genes (Table S1). Biomass traits of the control and treatment groups in RIL were significantly lower than those in the parental groups. However, the survival rate (SR) of the RIL was higher compared to the parents (Fig. 1, Table S1), which suggested that the RIL had pyramided more heat-tolerant genes from the parental materials during the recombination process. The difference in biomass traits between the control and treatment groups of GWAS population was less than that of RIL, while the SR of GWAS population was significant higher than that of RIL (Fig. 1, Table S1). These results inferred that there were more natural heat-tolerant genes and genetic diversity in the natural population. Exploring the heat-resistant QTLs of rice seedlings provides a possibility for comparison of control and post-treatment QTLs mapping to obtain QTLs for heat-resistant effects.
The correlations among biomass traits of RIL and GWAS population were analyzed separately (Fig. 2). The results showed that the correlations of the biomass traits between the two groups were different. Among them, the strongest correlation was found between dry weight and fresh weight of the GWAS treatment group, and the correlation coefficient was 0.91. In the RIL population, there was also a strong correlation between dry weight and fresh weight of the RIL treatment group, with a correlation coefficient of 0.75. These results suggested that there should be genes in rice seedlings that affected both dry and fresh weight of rice seedlings simultaneously under high temperature stress and responded to high temperature stress in this way. Previous studies have confirmed that seedling biomass traits can reflect rice seedling vigor[19]. But in this study, although the effect of high temperature stress on rice seedlings could be directly reflected by biomass traits, correlation analysis showed that there was a very weak correlation between survival rate traits that directly reflected the vigor of rice seedlings and the biomass of rice seedlings. This indicates that biomass traits are not the unique manifestation in response to high temperature stress in rice, and that survival rates and biomass traits are regulated by different genes.
QTL mapping of RIL
Using the high density map of RIL in 2013[20], QTL mapping of the RIL with or without heat stress treatment was carried out. In total, 20 QTLs, distributed on the chromosomes 2, 3, 4, 6, 7, 9,11, and 12, were detected (Table 1). The detection of multiple QTLs for the same trait indicates that the biomass trait is indeed a quantitative trait. There are several known genes related to plant heat tolerance in the above QTL intervals, including GS8[21] and abl1[22]. All of these known genes confirmed the accuracy of the linkage analysis. The results showed that qDW11, a locus for dry weight trait in the treatment group, had the highest logarithm of odds (LOD) value (6.72). Meanwhile, we found qDW11 and another loci qFW11.1 for fresh weight in the treatment group are two identical intervals, which is also verified by the correlation analysis results (Fig. 2). In the analysis of dry and fresh weight of the control group, the same set of QTLs (qCDW11 and qCFW11) are also identified on chromosome 11. However, the same interval for dry and fresh weight detected in the control group was inconsistent with that in the treatment group, indicating that the QTL identified in the treatment group responded to the influence of high temperature. Therefore, we speculate that qDW11 and qFW11.1 contain genes that respond to high temperature stress, which are not expressed under normal growth conditions, and we will focus on this candidate interval later.
GWAS of heat-related traits in the seedling stage
A genome-wide association analysis of 255 natural rice lines (Table S3) was performed using the existing genome-wide coverage of 14,779,691 SNPs data from 3k database[23]. It is generally considered that LD is between 100 kb and 200 kb in rice[18]. In this study, the SNP threshold for the significantly associated sites is P-value <10-6 and the significant SNPs sites within the 100 kb interval are considered candidate loci (Table S3). There were a large number of different significant loci in the heat treatment and control groups, which suggested that the high temperature in the seedling stage might affect the growth of the natural population through these loci (Fig 3 and S4). According to the stringent screening conditions, only a small number of reasonable SNPs were associated with the shoot length, so the P value was reduced to < 10-5 for subsequent analysis. In this way, a total of 10 intervals for biomass traits were identified in the control groups, while 29 intervals for biomass traits were present on all of the chromosomes other than chromosome 3 in the heat treatment group (Table S3). For the dry weight and fresh weight of the treatment groups, the consistent intervals, qDW_ind8 (3.09Mb-3.29Mb) and qFW_ind8 (3.09Mb-3.29Mb), were identified on chromosome 8. Therefore, there are important candidate genes controlling each biomass traits under high temperature stress in this interval. It can be inferred that the expression of one or more genes in this interval is in response to high temperature stress by adjusting the dry weight and fresh weight characters of rice seedlings.
There are 77 intervals containing 283 SNPs for survival rate after heat treatment (Fig. 4a, Table S3). Among them, qSR_ind9-3 on chromosome 9 contains gene Ugp1, a known heat-tolerant gene at the heading stage[24](Fig. 4c). As Ugp1 is expressed during the whole growth and development period of rice[24], it is suggested that this gene could also had an effect on the rice seedling response to heat stress. To further determine the natural variation of Ugp1, we performed haploid genotype analysis using all of the non-synonymous SNPs in the coding region of this gene (Fig. 4d). A total of 10 SNPs in the coding sequence (CDS) region were detected, and four major haploid genotype were identified based on the 10 SNPs. Hap.1 is the main haploid genotype in nature population, and the varieties belonging to Hap.1 have higher survival rate after high temperature stress treatment in rice seedling stage. It can be found that the expression of Ugp1 does have an effect on the response of rice seedlings to high temperature stress. Among 255 materials, a total of 191 varieties belong to the Hap.1 haploid genotype, which proves that Hap.1 has a higher utilization rate in the population. Compared with Hap.1, the survival rate of Hap.2 after the change of chr9_21920470 was low, but the biomass increased significantly. For Hap.3, all sites except chr9_21920470 changed. Although the survival rate of Hap.3 is also lower than that of Hap.1, the decline rate of survival rate is lower than that of Hap.2. Therefore, we judge that the chr9_21920470 locus has a stronger effect on the survival rate of rice seedlings than other loci. In Hap.4, where chr9_21920592 changes, compared with Hap.1, shoot length is lower. From this, we judge that this site may be related to plant growth.
Compare results from linkage analysis and GWAS
We compared the QTLs obtained by linkage analysis with the candidate intervals identified by GWAS. Two groups of candidate regions identified on chromosome 7 and chromosome 6 were co-located by linkage analysis and association analysis. There was a 200 kb (5.59 Mb-5.70 Mb) coincidence interval in qFW6 and qFW_ind6 for the fresh weight trait in the treatment group. Analyzing the LD attenuation distance of this interval (Fig. 5), we further reduced the candidate interval to 69 kb (5.62 Mb-5.69 Mb). According to MSU V7.0[25] annotated information, there are 13 candidate genes in this interval, including 12 genes involved in expression (Table S4). The same analysis was performed on the interval of 200 kb (17.82 Mb-18.02 Mb) in qDW7 and qDW_ind7 on chromosome 7, we reduced the candidate interval on chromosome 7 to 57 kb (17.90 Mb-17.95 Mb), and identified 6 genes (Table S4). To further investigate candidate genes, we performed homologous analysis on the obtained genes. The results of homologous analysis showed that the homologous gene LECRK-VII.2 of LOC_Os06g10790 was involved in the response to high temperature and drought in Arabidopsis[26]. The homologous gene UGT73B3[27] of LOC_Os06g10860 and the homologous gene UGT85A2[28] of LOC_Os07g30330 are involved in the stress response of Arabidopsis. The homologous gene PHT3[29] of LOC_Os06g10810 is related to the oxidation-reduction reaction of Arabidopsis thaliana. These homologous genes on the one hand confirm the accuracy of the experimental results, on the other hand, further narrow the range of candidate genes in this study.
Transcriptome analysis by RNA sequencing
In order to study the transcriptome response of indica rice to high temperature in the seedling stage, two heat-resistant (HR) varieties and two heat-sensitive (HS) varieties were selected for RNA_seq analysis. The PCA results showed a difference between the heat-resistant group and the heat-sensitive group (Fig. 6a). After comparing the gene expression between the heat-resistant groups and the heat-sensitive groups, it was determined that a total of 529 genes were differential expression in the four groups (Fig. 6c), and it was considered that there were key genes responding to high temperature stress in rice seedling. Kyoto encyclopedia of genes and genomes (KEGG) analysis was performed on 529 differential expressed genes (DEGs). The results showed that 75 DEGs were significantly enriched in pathways such as metabolic pathways, pyruvate metabolism, starch and sucrose metabolism (Table S5). OsCML4[30], which is enriched in the mitogen-activated protein kinase (MAPK) signaling pathway-plant pathway, has been shown to improve the rice tolerance by removing reactive oxygen species and inducing other stress related genes in a form independent of ABA survival rate. This indicates that there are genes in DEGs genes that can respond to high temperature stress for subsequent research. Further, we identified LOC_Os01g09450, LOC_Os03g59040 and LOC_Os12g42980 in the GWAS candidated intervals qSR_ind1-1, qSR_ind3-9, and qSR_ind12-7, respectively. The three genes were located in the KEGG pathway in plant hormone signal transduction, metabolic pathways, and cysteine and methionine metabolism pathways, and the expression of LOC_Os03g59040 was confirmed to be related to the stomatal conductance of rice[31]. We conclude that LOC_Os01g09450, LOC_Os03g59040 and LOC_Os12g42980 could be used as candidate genes for rice seedlings in response to high temperature stress. Based on the gene ontology (GO) enrichment analysis, most genes were enriched in the oxidation reduction, oxidoreductase activity, transmembrane transport and others (Fig. 6b). We focused on the 44 genes enriched in oxidation reduction and oxidation activity activities. Among the 44 genes identified, there are 10 down-regulated genes and 22 up-regulated genes (Fig. 6d, f). After a comparison with these genes and the intervals obtained in GWAS, it was found that LOC_Os02g12890 was located on the qSR_ind2-1 for survival rate. It was suggested that LOC_Os02g12890 could be the candidate gene which may respond to high temperature stress in the rice seedling stage and directly affect the heat tolerance of rice seedlings.
Based on the transcriptome data and the enriched results, we randomly selected 10 genes for quantitative real-time PCR (qRT-PCR) detection. In the differential gene analysis of different groups, the fragment per kilobase million (FPKM) and qRT-PCR results of each of the 10 genes have the same trend (Fig. 7 and S5), which proves that the transcriptome sequencing results are accurate.
In summary, a total of 23 candidate genes were co-localized by two or more methods, including 19 genes co-located by linkage analysis and GWAS and genes in 4 related pathways by GWAS and transcriptome analysis. 8 genes, such as LOC_Os06g10790, LOC_Os06g10860, LOC_Os07g30330, LOC_Os06g10810, LOC_Os01g09450, LOC_Os03g59040, LOC_Os12g42980, LOC_Os02g12890, etc. have been reported to be related to the plant response to heat stress due to homologous analysis or the focus has been reported in the follow-up research on the plant response to heat stress.