Phenotypic variation of lint yield components under salt conditions
SBW, LP and BNPP were measured in 316 upland cotton accessions (Additional file 1: Table S1) grown under four different salt conditions (Additional file 2: Fig. S1A and Additional file 3: Table S2). The average values of SBW, LP and BNPP traits ranged from 5.28g to 6.84g (Additional file 4: Table S3), 37.89% to 40.39% (Additional file 5: Table S4) and 5.01 to 7.74 (Additional file 6: Table S5) under the four different salt conditions, respectively. BNPP had the largest coefficients of variation (CV), ranging from 22.53% to 48.08%, and LP had the smallest (8.15%~12.68%). To decrease the environmental errors, we further evaluated the best linear unbiased prediction (BLUP) value of the three traits in the same salt conditions. The BLUP value showed that the distribution of SBW was 5~7 g (Additional file 2: Fig. S1B) and LP at 35~45% (Additional file 2: Fig. S1C). However, the BNPP showed a larger difference of 3~7 in salt conditions A and B, and 5~9 in salt conditions C and D (Additional file 2: Fig. S1D).
Paired-samples t-tests were conducted to further investigate the phenotypic variation. The SBW was slightly higher in condition A than in conditions B, C and D (Fig. 1A); however, no significant difference in LP was detected between the four salt conditions (Fig. 1B). Most significantly, the BNPP decreased with the increase in salt concentration (Fig. 1C). In addition, the LYPP was also decreased with the increase in salt concentration, mainly due to changes in BNPP (Fig. 1D). We compared the differences in SBW, LP, BNPP and LYPP at the highest and lowest salt conditions, and found that SBW increased by 5.29%, LP remained unchanged, BNPP decreased by 17.75%, and LYPP decreased by 16.79%. Taken together, these results suggest that the lint yield of cotton is mainly affected by a decrease in boll number, although SBW was slightly increased. This might be due to severely low boll numbers under high salt conditions.
Correlation analysis was conducted on SBW, LP and BNPP under four salt conditions (Additional file 7: Fig. S2). The results showed that there was little correlation among the three traits. However, LP values under different salt environments were highly correlated, with R values ranging from 0.75 to 0.83, and this was followed by SBW, which had R values ranging from 0.43 to 0.63. BNPP showed no or weak correlation in four salt conditions, further indicating that BNPP was largely affected by salt concentration.
GWAS of lint yield related traits
Using the genotypic data of 57,413 high-quality SNPs [18], a GWAS was conducted for the three traits in different environments and the BLUP values, using six methods (“mrMLM”, “FASTmrMLM”, “FASTmrEMMA”, “pKWmEB”, “ISIS EM-BLASSO” and “pLARmEB”) of multi-loci mixed linear model (MLM) model in the R package “mrMLM” [19]. In total, 854 quantitative trait nucleotides (QTNs) on 26 chromosomes were identified as significantly associated with the three traits (Additional file 8: Table S6). We referenced the linkage disequilibrium (LD) in previous report [18] and calculated the average LD from each chromosome, further selected the lowest LD, about 200 kb, as the LD threshold, for merging QTNs into the same QTL. In total, 600 QTLs, including 151 of SBW, 417 of LP and 112 of BNPP, were detected 1446 times under different environments and BLUPs by six multi-loci MLM models (Table 1). For each trait, most QTLs (85 of SBW, 235 of LP and 65 of BNPP) were detected only once, implying that these QTLs are apt to be affected by environmental conditions (Additional file 9: Fig. S3). To improve the reliability and stability of associated QTLs, we selected those that were detected three or more times across different methods or environments as stable QTLs for further analysis. As a result, 42, 91 and 25 QTLs were identified in SBW, LP and BNPP, respectively (Table 1).
The chromosomal distribution showed that these stable QTLs were widely distributed on 26 chromosomes, with more QTLs of SBW and BNPP located on the At sub-genome than on the Dt sub-genome, while QTLs of LP showed opposite (Fig. 2A). Most SBW QTLs were located on chromosomes A11 and A12, LP on A08, D06 and D13, and BNPP on A05, A12 and D07 (Fig. 2B). A Venn diagram of these stable QTLs showed that no co-localized QTL was detected within three traits and most QTLs were specific for individual traits (Fig. 3A), implying great differences in the genetic control of the three traits. We further analyzed the QTLs of each trait under different salt conditions. Only one QTL of SBW was detected under all four salt conditions, and no overlapping QTLs were detected in BNPP (Fig. 3B, C). However, most QTLs of LP were detected repeatedly under different salt conditions: there were 8 QTLs of LP detected simultaneously under all four salt conditions (Fig. 3D). These results suggest complex and variable regulatory mechanisms for SBW and BNPP under different salt conditions, but stable and highly heritable LP regulation against salt stress.
Identification of candidate genes in QTLs
Potential candidate genes in these stable QTL regions were extracted based on the released G. hirsutum TM-1 genome [20]. In total, 1166, 2748, and 711 candidate genes were identified in the stable QTL regions for SBW, LP and BNPP, respectively (Fig. 2C), with most genes distributed on chromosomes A12, D13 and A12 for SBW, LP and BNPP, respectively (Fig. 2D). With GO analysis, the genes in QTL regions for SBW were found to be enriched in “embryo development” and “regulation of cell shape” (Additional file 10: Fig. S4 and Additional file 11: Table S7). The genes from LP QTLs were enriched in several pathways, including hormone and ROS regulation pathways, such as “regulation of gibberellic acid mediated signaling pathway”, “positive regulation of reactive oxygen species metabolic process” and “brassinosteroid biosynthetic process”. They were also enriched in carbohydrate metabolism processes, such as “glucose metabolic process” and “hexose biosynthetic process”, which was consistent with the previous reports that these GO items play crucial roles in fiber development [11, 21]. In addition, we identified 14, 21 and 10 genes related to “Golgi vesicle transport”, “plant-type secondary cell wall biogenesis” and “glucose metabolic process”, respectively, however, none of these process-related genes were found in the QTL regions of BNPP or SBW (Additional file 12: Fig. S5 and Additional file 11: Table S7). The function of genes associated with BNPP were mainly enriched in “mitotic cell cycle”, “ion transmembrane transport” and “polysaccharide catabolic process” (Additional file 13: Fig. S6 and Additional file 11: Table S7), implying that BNPP is closely related to stress responses.
Genes relevant to LP
Via tissue and organ transcriptome profiling [22], we identified 182 genes from LP QTLs with predominant expression during fiber development. Of them, 29, 35, 73 and 45 genes were highly expressed at 10 DPA, 15 DPA, 20 DPA and 25 DPA, respectively (Additional file 14: Fig. S7 and Additional file 15: Table S8). Further, we focused on the genes in the regions of 8 stable QTLs under all four salt conditions and identified 10 genes predominantly expressed during fiber development (Table S9). Of them, Gh_A05G2488, encoding an auxin transport facilitator family member called PIN-FORMED LIKES proteins (PILS), was located in a high frequency associated QTL (A05: 32377816-33100112) which was detected 21 times with multiple methods and environments. Auxin is essential for plant growth and development, including cotton fiber development. Prominent auxin carriers with fundamental importance during plant development are PIN-FORMED (PIN) proteins [23, 24]. Two genes, Gh_D13G0342, which encodes RAB GTPase homolog G3F (RABG3F), and Gh_A10G2138, which encodes prenylated RAB acceptor protein 1 (PRA1), were located on QTL (D13: 3297533-3912736) and QTL (A10: 99896949-100396471), respectively. Previous reports suggested that RAB genes play crucial roles in cell polarity growth, including elongation of pollen tubes [25], root hairs [26], and cotton fibers [27].
Cotton seed transfer cells are enriched in callose, which regulates fiber elongation and secondary wall thickening [28]. Zhang et al. (2007) showed that the transcription factor MYB103 affects callose dissolution during the anther development in Arabidopsis [29]. We found that Gh_D03G1419, which encodes a MYB103 transcription factor (named GhMYB103), was located in a high frequency associated QTL (D03: 42299450-43529568). Sequence analysis showed that two QTNs associated with LP were located in the 3739 bp upstream (TM55216, D03: 42969311) and exon (TM55217, D03: 42973276) regions of the gene, respectively (Fig. 4A). A single nucleotide mutation (from C to G) at the TM55217 locus led to a change of amino acid from leucine (L) to valine (V) (Fig. 4A). Through a Student’s t test, we found that the LP values with the A genotype in TM55216 were significantly higher than with the G genotype (Fig. 4B), and with the G genotype in TM55217 significantly higher than with the C genotype (Fig. 4C). The two QTNs could generate 3 haplotypes; H1: AG, H2: AC and H3: GC. The LP values with AG and AC haplotypes were significantly higher than that with GC haplotypes. However, there was no significant difference between AG and AC haplotypes, implying that QTN TM55216 might play more important roles in LP (Fig. 4D).
In addition, we integrated the LP QTLs with GWAS signals published in previous reports [11, 12], and found three QTLs (A12: 602614-743324; D13: 58792627-59289811; A09: 4676815-5076815) that overlapped with GWAS signals. The three QTLs were detected in different salt conditions. Further, 10 genes from the QTL regions were identified to be predominantly expressed during fiber development (Additional file 16: Table S9). Of them, ABP1 [30] and CPK17 [31] were related to hypocotyl growth or fiber development.
Taken together, the characteristics of genes from these stable LP QTLs and their predominant expression during fiber development suggests that they could play an important role in the LP improvement in breeding practice.
Genes relevant to BNPP
The QTLs of BNPP were easily affected by changes to the environment. No repeated QTLs were detected under all four salt conditions and most QTLs were identified in only one or two salt environments. For example, QTLs TM57617_TM57620 (D05: 24.3-24.8 Mb, detected with 16 times) and TM74225 (D10: 24.1-24.5 Mb, detected with 7 times) were identified in salt conditions C and D. However, QTL TM52041_TM52044 (D02: 49.2-49.7 Mb, detected with 6 times) was identified only in salt condition A, and QTL TM29006 (A08: 81.7-82.1 Mb, detected with 5 times) was identified only in salt condition B. These results indicate that the genes regulating the number of bolls varied in different salt conditions.
A large number of items related to ion transport were found to be enriched through GO analysis (Additional file 13: Fig. S6 and Additional file 11: Table S7). In order to further explore the relationship between stress tolerance and boll number, we performed RNA-seq analysis using the salt stress transcriptome of Gossypium acc. TM-1. With the filter of FPKM ≥ 3, 500 genes (446 in roots and 395 in leaves) in QTL regions were obtained. With GO annotation and differential expression analysis, we focused on 40 salt-inducible stress response genes. Of them, 6 were commonly identified in roots and leaves (Additional file 17: Fig. S8, Additional file 18: Table S10 and Additional file 19: Table S11). Gh_A04G1216 encodes a high-affinity K+ transporter 1 (HKT1). AtHKT1 limits root-to-shoot sodium transportation and is believed to be essential for salt tolerance in Arabidopsis thaliana [32]. Gh_A05G3239 encodes a peroxidase superfamily protein (POD), which has been shown to play an important role in anti-oxidation under salt stress in cotton [33]. Gh_A08G1183 encodes a mitogen-activated protein kinase (MAPK), which has been widely reported to be associated with salt tolerance in cotton [34]. In addition to the ion transport GO term, carbohydrate metabolism was active under salt stress, for example, 10 genes were identified in the enriched GO term “polysaccharide catabolic process” and 3 of them were differentially expressed under salt stress. Cell cycle regulation is of pivotal importance for plant growth and development [35]. A total of 33 genes were identified in the enriched GO term “cell cycle”, 8 of which were differentially expressed under salt stress. These candidate genes could contribute to increasing boll number under different salt environments in cotton.
In order to explore the key genes and mechanisms of boll number regulation under high and low salt conditions, we compared the genes located in the QTL regions of BNPP between conditions A and D. In total, 204 and 265 genes were identified under salt conditions A and D, respectively. GO enrichment analyses showed significantly different functional classification of genes between high and low salt conditions. GO terms related to “polysaccharide metabolic process”, “carbohydrate catabolic process” and “cell wall organization” were enriched under high salt condition A (Fig. 5A, Additional file 20: Table S12), while “cell cycle”, “ion transmembrane transport” and “regulation of signal transduction” were enriched under low salt condition D (Fig. 5B). This indicates that carbohydrate metabolism and cell structure maintenance play crucial roles under high salt conditions, while ion transport is a basal process that is more important under low salt conditions. Under high salt conditions, several candidate genes associated with BNPP were detected. Gh_A11G1551 encodes a proline dehydrogenase 1 (ProDH1), also called early responsive to dehydration 5 (ERD5), which has been studied extensively, especially under abiotic stress [36]. For energy metabolism, Gh_A05G1912 encodes an isoamylase 3 (ISA3), which contributes to starch breakdown. Under low salt conditions, three candidate genes were identified to play important roles in the balance between sodium and potassium. In detail, Gh_A10G0441 encodes a potassium transporter 1 (KUP1) [37], Gh_A12G0074 encodes a high affinity K+ transporter 5 (HAK5) [38] and Gh_A12G0061 encodes a sodium hydrogen exchanger 2 (NHX2) [39].