Genome-wide identification and bioinformatics analysis of Gh4CL genes
Using the approach described in Materials and Methods, we identified 34 Gh4CL genes in G. hirsutum. They are randomly distributed across 22 chromosomes and one scaffold that has not been anchored to a chromosome (Fig. 1a, Table 1). We named them Gh4CL1 to Gh4CL34 based on their chromosomal location. Two pairs of Gh4CL genes, Gh4CL10/11 and Gh4CL21/22, are arranged in tandem on chromosomes A09 and D03, respectively. MCScanX analysis suggested that segmental duplication may have contributed to the generation of 12 Gh4CLs. Phylogenetic analysis of the 34 Gh4CL genes together with 4CL genes from A. thaliana, G. max, P. tremuloides, P. trichocarpa, R. idaeus, and I. tinctoria clustered them into three groups (Fig. 1b). The Ka/Ks ratio of each homologous/paralogous Gh4CL pair is < 1 (Additional file 2: Table S1), suggesting that these Gh4CL genes have experienced purifying selection during their evolutionary history, which eliminates deleterious mutations.
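The interpretation of the Ka/Ks ratio used above can be illustrated with a minimal Python sketch. The function name, the tolerance and the (Ka, Ks) values are hypothetical and chosen only for illustration; they are not taken from Additional file 2: Table S1.

```python
def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Label a gene pair by its Ka/Ks ratio.

    Ka/Ks < 1 is read as purifying selection, Ka/Ks = 1 as neutral
    evolution, and Ka/Ks > 1 as positive selection.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical (Ka, Ks) values for illustration only.
example_pairs = {
    "pair_A": (0.012, 0.045),
    "pair_B": (0.030, 0.030),
    "pair_C": (0.080, 0.050),
}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in example_pairs.items()}
```

Under this reading, a result in which every pair is labelled "purifying" corresponds to the pattern reported for the Gh4CL pairs.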
The Gh4CL proteins range from 129 to 576 amino acids (aa) in length, with ORFs of 390 to 1731 bp, molecular weights of 14.11 to 63.08 kDa, and pI values of 5.3 to 9.77. Most Gh4CLs are predicted to be associated with various biomembranes based on subcellular localization prediction (Table 1). Analyses of gene structures and motifs showed that each Gh4CL has multiple exons, introns and motifs (Additional file 1: Fig. S1; Additional file 2: Table S2). All the Gh4CL proteins have two structural domains, a putative AMP-binding domain “SSGTTGLPKG” (Box I) and a conserved domain “GEICIRG” (Box II) (Additional file 1: Fig. S2).
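The reported ORF and protein lengths are mutually consistent if the ORF length is assumed to include the stop codon. A short Python check of that arithmetic follows; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp: int) -> int:
    """Return the encoded protein length in aa for an ORF given in bp.

    Assumes the ORF runs from the start codon through the stop codon,
    so one codon (the stop) does not contribute an amino acid.
    """
    if orf_bp < 6 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3 and at least 6 bp")
    return orf_bp // 3 - 1


shortest = protein_length_from_orf(390)    # smallest ORF reported in the text
longest = protein_length_from_orf(1731)    # largest ORF reported in the text
```

With this assumption, the 390 bp and 1731 bp ORFs correspond to the 129 aa and 576 aa proteins given above.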
Cis-elements, in combination with transcription factors, regulate the transcription level of a gene. To identify potential cis-elements involved in the transcriptional regulation of Gh4CL genes, we scanned the promoter region (2 kb upstream of ATG) of each Gh4CL gene using the online tool PlantCARE [20]. Many Gh4CL genes harbored plant hormone-responsive and/or stress-responsive elements, including ABA-responsive elements (ABREs), auxin-responsive elements (AuxRR-core, TGA-elements and TGA-box), MeJA-responsive elements (CGTCA-motif, TGACG-motif), gibberellin-responsive elements (TATC-box, GARE-motif and P-box), salicylic acid-responsive elements (TCA-elements), low-temperature-responsive elements (LTR), defense- and stress-responsive elements (TC-rich repeats) and drought-responsive elements (MBS) (Additional file 1: Fig. S3).
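Defining a promoter window of 2 kb upstream of the ATG depends on the strand of the gene. The Python sketch below shows one way to compute such a window, assuming 1-based inclusive genomic coordinates; the function and parameter names are hypothetical, and this is not necessarily how the sequences submitted to PlantCARE were extracted.

```python
from typing import Optional, Tuple


def promoter_interval(atg_pos: int, strand: str, length: int = 2000,
                      chrom_length: Optional[int] = None) -> Optional[Tuple[int, int]]:
    """Return the (start, end) of the region upstream of the start codon.

    atg_pos is the genomic coordinate of the 'A' of the ATG (1-based).
    On the '+' strand upstream means lower coordinates; on the '-' strand
    it means higher coordinates. The window is clipped to the chromosome.
    Returns None if no upstream sequence is available.
    """
    if strand == "+":
        start, end = max(1, atg_pos - length), atg_pos - 1
    elif strand == "-":
        start, end = atg_pos + 1, atg_pos + length
        if chrom_length is not None:
            end = min(end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return (start, end) if start <= end else None


plus_window = promoter_interval(5000, "+")                      # (3000, 4999)
minus_window = promoter_interval(5000, "-")                     # (5001, 7000)
clipped_window = promoter_interval(500, "+")                    # (1, 499)
```

For a minus-strand gene, the extracted sequence would additionally need to be reverse-complemented before scanning for motifs.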
Tissue specific expression patterns of Gh4CL genes
Expression patterns in various tissues provide clues to the possible biological functions of genes of interest. We thus analysed the transcript abundance of the Gh4CL genes in different tissues (root, stem, leaves, flower, ovule and fibers at 5, 10, 15 and 20 days post-anthesis (DPA)) under normal growth conditions using publicly available RNA-seq data (BioProject Accession: PRJNA248163) [21]. We found that 10 Gh4CL genes were expressed in all the tested tissues [based on fragments per kilobase of transcript per million mapped reads (FPKM) ≥ 1], and 4 genes (Gh4CL3, Gh4CL5, Gh4CL18 and Gh4CL27) showed weak or no expression in all tissues analysed (Fig. 2). In addition, 8 Gh4CL genes (Gh4CL2, Gh4CL4, Gh4CL8, Gh4CL12, Gh4CL17, Gh4CL24, Gh4CL29 and Gh4CL30) were highly expressed (FPKM ≥ 20) in stem, with the highest expression level observed for Gh4CL17 (FPKM ≥ 84), and 6 genes (Gh4CL7-8, Gh4CL12, Gh4CL20 and Gh4CL30-31) were strongly expressed in leaves, with the highest expression level observed for Gh4CL20 (FPKM ≥ 202).
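The FPKM thresholds used above (FPKM ≥ 1 for "expressed", FPKM ≥ 20 for "highly expressed") can be applied to an expression table as in the following Python sketch. The gene names and FPKM values are hypothetical placeholders, not values from the PRJNA248163 dataset.

```python
from typing import Dict, List

ExpressionTable = Dict[str, Dict[str, float]]

# Hypothetical FPKM values for illustration only.
fpkm: ExpressionTable = {
    "geneA": {"root": 3.2, "stem": 25.0, "leaf": 1.5},
    "geneB": {"root": 0.2, "stem": 0.0, "leaf": 0.4},
    "geneC": {"root": 0.8, "stem": 40.1, "leaf": 210.0},
}


def expressed_everywhere(table: ExpressionTable, threshold: float = 1.0) -> List[str]:
    """Genes whose FPKM reaches the threshold in every tissue."""
    return sorted(g for g, by_tissue in table.items()
                  if all(v >= threshold for v in by_tissue.values()))


def highly_expressed_in(table: ExpressionTable, tissue: str,
                        threshold: float = 20.0) -> List[str]:
    """Genes whose FPKM reaches the threshold in one named tissue."""
    return sorted(g for g, by_tissue in table.items()
                  if by_tissue.get(tissue, 0.0) >= threshold)


ubiquitous = expressed_everywhere(fpkm)
stem_high = highly_expressed_in(fpkm, "stem")
leaf_high = highly_expressed_in(fpkm, "leaf")
```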
Expression analysis of Gh4CL genes under different abiotic stress conditions
Because 4CL genes respond to biotic and abiotic stresses in various plant species, we further investigated the transcript abundance of the Gh4CL genes under cold, heat, PEG and salt stresses using the transcriptomic data of G. hirsutum (BioProject Accession: PRJNA248163) [21]. We found that 26 Gh4CL genes were induced significantly by one or more stresses, and the remaining 8 Gh4CL genes (Gh4CL3, Gh4CL5, Gh4CL10, Gh4CL18-19, Gh4CL23, Gh4CL28 and Gh4CL34) were not induced by any of the four stresses (Fig. 3a). Comparing the four stress conditions, more Gh4CL genes showed altered expression in response to salinity, cold and heat stresses than to PEG stress. Notably, 11 Gh4CL genes (Gh4CL2, Gh4CL7-9, Gh4CL11-13, Gh4CL17, Gh4CL22, Gh4CL25 and Gh4CL31) showed increased expression (treatment FPKM/control FPKM ≥ 1.5) in response to PEG stress over the 3 h to 12 h time period. To verify these results, we investigated the expression patterns of selected Gh4CL genes under simulated drought treatment using quantitative real-time polymerase chain reaction (qRT-PCR). As shown in Fig. 3b, the expression levels of Gh4CL7-8, Gh4CL12-13, Gh4CL17, Gh4CL22 and Gh4CL24 were up-regulated in cotton leaves over the period from 3 h to 24 h after PEG stress, consistent with the RNA-seq based results.
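The induction criterion (treatment FPKM/control FPKM ≥ 1.5) can be sketched in Python as follows. The text does not state whether the threshold must be met at every time point or at any one of them, nor how zero control values were treated, so both are handled here as explicit assumptions: the `require_all` switch is hypothetical, and time points with a control FPKM of zero are skipped. The FPKM values are invented for illustration.

```python
from typing import Dict


def is_induced(treatment: Dict[str, float], control: Dict[str, float],
               threshold: float = 1.5, require_all: bool = False) -> bool:
    """Test whether the treatment/control FPKM ratio reaches the threshold.

    Time points with a missing or zero control value are skipped
    (an assumption; the ratio is undefined there).
    """
    ratios = []
    for time_point, value in treatment.items():
        baseline = control.get(time_point, 0.0)
        if baseline <= 0:
            continue
        ratios.append(value / baseline)
    if not ratios:
        return False
    check = all if require_all else any
    return check(r >= threshold for r in ratios)


# Hypothetical FPKM values at three PEG time points.
peg = {"3h": 6.0, "6h": 9.0, "12h": 4.0}
ctrl = {"3h": 4.0, "6h": 4.0, "12h": 4.0}

induced_any = is_induced(peg, ctrl)                       # ratios: 1.5, 2.25, 1.0
induced_all = is_induced(peg, ctrl, require_all=True)
```

The two settings can give different gene lists, so the choice between them should be stated alongside any reported count.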
Gh4CL7 plays an important role in lignin biosynthesis
Based on the above analyses of promoter cis-elements and expression patterns under drought stress, three Gh4CL genes (Gh4CL7, Gh4CL8 and Gh4CL13) were considered candidate genes with a potential role in regulating the drought stress response in cotton. In this study, we selected Gh4CL7 for further functional analysis by silencing its expression in cotton and overexpressing it in Arabidopsis thaliana.
We used VIGS to silence the expression of Gh4CL7 using the TRV vector (TRV:Gh4CL7; Additional file 1: Fig. S4). TRV:GhCHLI was used as a positive control for the VIGS experiment (Additional file 1: Fig. S5). Arabidopsis thaliana plants overexpressing Gh4CL7 (Gh4CL7-OE) were obtained by the floral dip method. Gh4CL7 belongs to class I, whose genes have been shown to regulate lignin biosynthesis [10, 22]. We thus first investigated whether Gh4CL7 is also involved in lignin biosynthesis by comparing the lignin contents of the Gh4CL7-OE Arabidopsis lines and the TRV:Gh4CL7 cotton plants with those of their corresponding control plants. The lignin content increased by approximately 10% in the Gh4CL7-OE lines compared with wild-type (WT) plants (Fig. 4a), whereas it decreased by approximately 20% in the TRV:Gh4CL7 plants compared with the TRV:00 plants (Fig. 4b). Additionally, stems of the TRV:Gh4CL7 plants were sectioned and stained with phloroglucinol-HCl to detect the presence of lignin (Fig. 4c). The stem sections of the TRV:Gh4CL7 plants, which had reduced lignin content, stained light red, whereas those of the TRV:00 plants displayed the typical purple-red color after phloroglucinol-HCl staining. These results suggested that Gh4CL7 is related to lignin synthesis. We also analysed the expression levels of phenylpropanoid pathway genes related to lignin biosynthesis, including GhPAL, GhCCoAOMT1, GhCOMT1, GhCOMT2, GhCOMT3, GhCCR1, GhCCR2, and GhCAD. The relative expression levels of these genes were lower in the TRV:Gh4CL7 plants than in the TRV:00 plants (Fig. 4d), indicating that Gh4CL7 could affect lignin accumulation by regulating the transcription levels of these downstream genes of the lignin biosynthesis pathway.
Silencing of Gh4CL7 compromises tolerance of cotton to drought stress
Phenotypic differences between the TRV:Gh4CL7 and TRV:00 plants were observed after 20 days of water deficiency treatment. Compared to the TRV:00 plants, the TRV:Gh4CL7 plants displayed severe wilting and yellowing leaves (Fig. 5a), consistent with a lower leaf relative water content (RWC) (Fig. 5b) and a decreased chlorophyll content (Fig. 5c). In addition, the size and the width-to-length ratio of stomata were significantly increased in the TRV:Gh4CL7 plants (Fig. 5d-f), which might accelerate transpiration under drought conditions, consistent with the observed higher water loss rate (WLR) (Fig. 5g). The hydrogen peroxide (H2O2) content and malondialdehyde (MDA) level were measured as indicators of cell damage in the TRV:Gh4CL7 and TRV:00 plants. During drought stress, the TRV:Gh4CL7 plants accumulated more MDA (Fig. 5h) and H2O2 (Fig. 5i) than the TRV:00 plants. The activities of superoxide dismutase (SOD), peroxidase (POD) and catalase (CAT) in the TRV:Gh4CL7 and TRV:00 plants were also measured to explore the function of Gh4CL7 in the modulation of antioxidant enzymes (Fig. 5j). As expected, under drought stress conditions, the TRV:Gh4CL7 plants displayed significantly reduced SOD, POD and CAT activities compared to the TRV:00 plants. Additionally, six stress-related genes (GhABI4, GhABF4, GhLEA14, GhRD22, GhRD29 and GhNCED1) were down-regulated in the TRV:Gh4CL7 plants after drought treatment (Additional file 1: Fig. S6). These results suggested that silencing of Gh4CL7 decreases the tolerance of cotton to drought stress.
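Two of the physiological indices mentioned above are simple ratios. The Python sketch below uses one commonly used definition of leaf RWC, (fresh weight − dry weight)/(turgid weight − dry weight) × 100, which is assumed here because the formula is not given in this section; the function names and numeric values are hypothetical.

```python
def relative_water_content(fresh_g: float, turgid_g: float, dry_g: float) -> float:
    """Leaf RWC in percent: (FW - DW) / (TW - DW) * 100 (assumed definition)."""
    denominator = turgid_g - dry_g
    if denominator <= 0:
        raise ValueError("turgid weight must exceed dry weight")
    return (fresh_g - dry_g) / denominator * 100.0


def stomatal_aspect_ratio(width_um: float, length_um: float) -> float:
    """Width-to-length ratio of a stoma; larger values indicate a more open pore."""
    if length_um <= 0:
        raise ValueError("length must be positive")
    return width_um / length_um


# Hypothetical measurements for illustration only.
rwc_percent = relative_water_content(fresh_g=0.80, turgid_g=1.00, dry_g=0.20)
aspect = stomatal_aspect_ratio(width_um=6.0, length_um=24.0)
```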
Overexpression of Gh4CL7 in Arabidopsis enhances drought tolerance
We further investigated the function of Gh4CL7 in response to drought stress using Arabidopsis plants overexpressing Gh4CL7. Three independent Gh4CL7-OE lines that showed an elevated level of Gh4CL7 (Fig. 6a) were selected for phenotyping under drought stress conditions. Compared to the WT plants, the three Gh4CL7-OE lines had a decreased germination rate (Fig. 6b), but showed a significantly increased root length under mannitol stress conditions (Fig. 6c, d). Three-week-old seedlings of Gh4CL7-OE and WT were used for water deficiency treatment. No obvious phenotypic difference was observed between Gh4CL7-OE and WT under mock treatment. However, the Gh4CL7-OE plants showed much less damage than WT after 10 days of water deficiency (Fig. 6e). Under drought stress conditions, the H2O2 content and MDA level in the Gh4CL7-OE plants were lower than those in WT (Fig. 6f-g), whereas the SOD, POD and CAT activities were significantly higher (Fig. 6h). Additionally, the size and the width-to-length ratio of stomata were significantly decreased in the Gh4CL7-OE Arabidopsis plants (Fig. 7a-b), consistent with the lower WLR observed in those plants (Fig. 7c). To further elucidate the possible mechanism of Gh4CL7 in response to drought stress, the transcript levels of four known ABA-responsive genes (AtRD29B, AtRD22, AtABI4, AtCOR15A) and two ABA-biosynthesis genes (AtNCED3 and AtNCED5) were analyzed in the Gh4CL7-OE lines and WT plants after drought stress treatment. The qRT-PCR data showed that drought stress induced the expression of these genes in Gh4CL7-OE, but induced them only slightly or not at all in WT (Additional file 1: Fig. S7). These results indicated that overexpression of Gh4CL7 could enhance the tolerance of transgenic Arabidopsis plants to drought stress.