Identification of the WRKY transcription factors in P. glaucum
The HMMSCAN search identified 97 WRKY transcription factors (PgWRKY1 to PgWRKY97) in the complete proteome database of P. glaucum. Protein sequence length, molecular weight (MW), isoelectric point (pI) and other indices were then analyzed for all 97 PgWRKYs. The sequence length of the WRKY proteins varies from 123 amino acid residues (PgWRKY16) to 1394 amino acid residues (PgWRKY85). Their MW ranges from 13.732 to 156.285 kDa, and their pI ranges from 4.49 to 10.29 (Additional file 1).
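As a rough cross-check of the reported sizes, the average molecular weight of a protein can be estimated from its sequence by summing average residue masses and adding one water molecule. The sketch below is illustrative only: the function name `protein_mw_kda` is hypothetical, and this section does not state which tool produced the values in Additional file 1.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water is added back for the free N- and C-termini


def protein_mw_kda(sequence: str) -> float:
    """Estimate the average molecular weight (kDa) of an unmodified protein."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


if __name__ == "__main__":
    # The conserved WRKY heptapeptide, used here only as a tiny example input.
    print(round(protein_mw_kda("WRKYGQK"), 4))
```

The reported extremes are in line with this kind of estimate: 13.732 kDa over 123 residues and 156.285 kDa over 1394 residues both correspond to roughly 112 Da per residue.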
Classification of PgWRKY proteins and phylogenetic analysis
The PgWRKY proteins were examined for conservation of the WRKY domain using multiple sequence alignment. In Figure 1, amino acid conservation is shown on a blue-to-red colour index, where blue indicates the least conserved and red the most highly conserved patches. The alignment showed high conservation of the “WRKYGQK” motif and the zinc-finger motif across the identified PgWRKYs. The 97 PgWRKY proteins were classified into three groups based on the number of WRKY domains and the structure of the zinc-finger motif: 9 PgWRKYs belong to Group I, 47 to Group II (the largest group) and 29 to Group III. We did not observe an intact zinc-finger motif in the remaining 12 PgWRKYs, which is consistent with earlier studies on Setaria italica, Gossypium hirsutum and Musa balbisiana [28, 29, 32]. Hence, these 12 PgWRKYs were kept in a separate group (Group IV; uncharacterized). Most of the PgWRKYs contain the conserved “WRKYGQK” motif, whereas a few have slight variations in this signature motif (Additional file 2).
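The grouping logic can be sketched with regular expressions. The example below assumes the commonly used WRKY scheme (two WRKY domains for Group I; one domain with a C2H2 zinc finger, C-X4-5-C-X22-23-H-X-H, for Group II; one domain with a C2HC zinc finger, C-X7-C-X23-H-X-C, for Group III). It is a simplification and not the authors' pipeline: it matches only the exact heptapeptide, so variant signature motifs would need a looser pattern, and the function name `classify_wrky` is hypothetical.

```python
import re

WRKY_MOTIF = re.compile(r"WRKYGQK")        # exact signature heptapeptide only
C2H2 = re.compile(r"C.{4,5}C.{22,23}H.H")  # C-X4-5-C-X22-23-H-X-H
C2HC = re.compile(r"C.{7}C.{23}H.C")       # C-X7-C-X23-H-X-C


def classify_wrky(seq: str) -> str:
    """Assign a WRKY protein to Group I-IV from motif counts (simplified)."""
    hits = list(WRKY_MOTIF.finditer(seq.upper()))
    if not hits:
        return "no exact WRKYGQK motif"
    if len(hits) >= 2:
        return "I"
    # One WRKY domain: inspect the region downstream of the heptapeptide.
    tail = seq.upper()[hits[0].end():]
    if C2HC.search(tail):
        return "III"
    if C2H2.search(tail):
        return "II"
    return "IV"  # no intact zinc finger, as for the 12 uncharacterized PgWRKYs


if __name__ == "__main__":
    # Synthetic toy sequences (alanine filler), not real PgWRKY proteins.
    finger_c2h2 = "C" + "A" * 4 + "C" + "A" * 22 + "HAH"
    finger_c2hc = "C" + "A" * 7 + "C" + "A" * 23 + "HAC"
    print(classify_wrky("WRKYGQK" + "AAAA" + finger_c2h2))
    print(classify_wrky("WRKYGQK" + "AAAA" + finger_c2hc))
```

A real classification would normally rely on domain models and alignment rather than exact string patterns; the regex form is used here only to make the decision rule explicit.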
A phylogenetic analysis was performed to examine the evolutionary relationships among the WRKY families of A. thaliana, O. sativa, S. italica and P. glaucum. A total of 379 WRKY proteins, including 72 from A. thaliana, 105 from O. sativa, 105 from S. italica and 97 from P. glaucum, were used to construct a phylogenetic tree as described in the Methods section. As shown in Figure 2, all 379 WRKYs clustered into major clades. WRKY members belonging to the same group (I, II or III) clustered in the same clade regardless of species (highlighted in Figure 2).
Chromosomal distribution and structure analysis of PgWRKY genes
The identified PgWRKYs were mapped on the seven chromosomes of P. glaucum (Figure 3). Eighty-eight PgWRKYs were unevenly distributed across the genome; the remaining 9 could not be mapped because their chromosomal coordinates were unavailable in the genome database. PgWRKYs were most abundant on chromosomes 1 (22 genes; ~23%) and 6 (21 genes; ~22%) and least abundant on chromosomes 5 and 7 (6 genes each; ~6%). Nineteen PgWRKYs were located in the telomeric region of chromosome 1, while 17 were located in the centromeric region of chromosome 6. Members of all groups were present on every chromosome except chromosomes 2 and 3, which lacked Group I and Group IV members, respectively (Additional file 3; Figure S1).
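The percentages quoted for the chromosome distribution can be reproduced with a few lines of arithmetic. The sketch below uses only the counts given in the text; it suggests that the rounded percentages are taken relative to all 97 identified genes rather than the 88 mapped ones, which is an inference from the numbers and not something the text states. Variable and function names are illustrative.

```python
TOTAL_IDENTIFIED = 97  # all PgWRKYs
TOTAL_MAPPED = 88      # PgWRKYs with chromosomal coordinates

# Per-chromosome counts stated in the text (other chromosomes are not itemized).
reported_counts = {"chr1": 22, "chr6": 21, "chr5": 6, "chr7": 6}


def percent(count: int, denominator: int) -> int:
    """Share of genes as a whole-number percentage."""
    return round(100 * count / denominator)


# Genes left for chromosomes 2, 3 and 4 combined.
remaining_on_chr2_3_4 = TOTAL_MAPPED - sum(reported_counts.values())

if __name__ == "__main__":
    for chrom, n in reported_counts.items():
        print(chrom, n, f"{percent(n, TOTAL_IDENTIFIED)}% of identified",
              f"{percent(n, TOTAL_MAPPED)}% of mapped")
    print("chr2 + chr3 + chr4:", remaining_on_chr2_3_4)
```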
The structural features of the identified PgWRKY genes were examined in detail using the GSDS server. Figure 4 shows the varying pattern of exonic and intronic regions in the 97 identified PgWRKYs. Among 88 PgWRKYs, the largest share (46.59%) had two introns and three exons; 15 PgWRKYs had one intron and two exons; 17 had three introns and four exons; 7 had four introns and five exons; 3 had five introns and six exons; 2 had six introns and seven exons; 1 had seven introns and eight exons; and 1 had sixteen introns and seventeen exons. PgWRKY47, however, had no introns (Additional file 1). Gene size also varied among the identified PgWRKYs, ranging from 476 bp (PgWRKY47) to 10991 bp (PgWRKY26).
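The intron classes listed above can be tallied to confirm that they account for all 88 genes. In the sketch below, the count of 41 genes with two introns is not stated in the text; it is derived from the reported 46.59% of 88 and should be read as an inference.

```python
# intron count -> number of PgWRKY genes, as listed in the text.
# The value for two introns (41) is inferred from 46.59% of 88.
genes_by_intron_count = {
    0: 1,    # PgWRKY47
    1: 15,
    2: 41,
    3: 17,
    4: 7,
    5: 3,
    6: 2,
    7: 1,
    16: 1,
}

total_genes = sum(genes_by_intron_count.values())
share_two_introns = round(100 * genes_by_intron_count[2] / total_genes, 2)

# Each class has one more exon than introns.
genes_by_exon_count = {introns + 1: n for introns, n in genes_by_intron_count.items()}

if __name__ == "__main__":
    print("genes tallied:", total_genes)
    print("share with two introns (%):", share_two_introns)
    print("genes by exon count:", genes_by_exon_count)
```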
Motif analysis was then performed with the MEME suite to identify the conserved motifs present in PgWRKYs. The schematic presentation of motifs (Figure 5) revealed that PgWRKYs contain different types of conserved motifs. We identified ten conserved motifs in the 97 PgWRKYs and named them motif 1 to motif 10. Motif 1 (WRKY motif) was distributed across all members of the PgWRKY family, whereas motif 8 (WRKY motif) was present only in Group I members. We also observed group-specific motif conservation: motif 4 was found only in Group I members, and motif 3 only in Group III members. Group II members showed different motif distribution patterns according to subgroup (IIa-IIe): motif 2 was specific to Groups IIa and IIb, motif 7 to Group IIb, motif 5 to Group IIc and motif 6 to Group IId. We did not find any subgroup-specific conserved motif in Group IIe. Group IV members did not possess any specific motif; however, motifs 2, 5 and 7 were partially conserved in a few members of Group IV (Additional file 4).
Synteny relationship and selection pressure analysis of WRKY orthologous genes
We also identified duplication events and analyzed the synteny relationships among the WRKYs of P. glaucum, A. thaliana, O. sativa and S. italica. A total of 33 chromosomes (P. glaucum, 7; A. thaliana, 5; O. sativa, 12; S. italica, 9) carrying 370 WRKYs (P. glaucum, 88; A. thaliana, 72; O. sativa, 105; S. italica, 105) were used to map the synteny relationships. In Figure 6, the WRKYs involved in segmental duplication and orthologous events are represented by lines of different colours. PgWRKYs on chromosome 1 (PG1) and chromosome 6 (PG6) have orthologous pairs on the AT1, AT4 and AT5 (A. thaliana), SI3 and SI5 (S. italica), and OS1 and OS5 (O. sativa) chromosomes, indicating hot-spots of PgWRKY distribution. In total, 10 pairs were tandemly duplicated and 13 pairs were segmentally duplicated (Additional file 5). We found 97 orthologous pairs between PgWRKYs and the WRKYs of A. thaliana, O. sativa and S. italica (Additional file 6). The Ka/Ks ratio (nonsynonymous to synonymous substitution rate) of all identified collinear pairs was less than 1, indicating predominantly synonymous substitution and hence purifying selection on PgWRKYs during evolution (Additional file 7).
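The selection-pressure reading can be made explicit in code. By the usual convention the statistic is Ka/Ks, the nonsynonymous rate divided by the synonymous rate: values below 1 point to purifying selection, values near 1 to neutral evolution, and values above 1 to positive selection. The function name, the tolerance and the example values below are hypothetical; the actual per-pair estimates are in Additional file 7.

```python
def selection_mode(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Interpret a gene pair's Ka/Ks ratio under the conventional thresholds."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "purifying" if omega < 1.0 else "positive"


if __name__ == "__main__":
    # Made-up (Ka, Ks) pairs for illustration only.
    for ka, ks in [(0.12, 0.60), (0.50, 0.50), (0.90, 0.30)]:
        print(ka, ks, selection_mode(ka, ks))
```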
Gene ontology annotation and cis-regulatory elements analysis
Gene ontology (GO) annotations of the 97 PgWRKY proteins were predicted using protein BLAST in the Blast2GO tool. Based on the identified GO terms, the involvement of the PgWRKY proteins in various biological processes, cellular components and molecular functions is shown in Figure 7. The majority of the predicted biological processes relate to metabolic pathways and responses to stress conditions. The molecular functions of these proteins were predicted to involve sequence-specific DNA-binding transcriptional activity. The promoters of the PgWRKY genes were analyzed using the PlantCARE database, taking the 1.5 kb upstream region of each gene. A total of 127 cis-regulatory elements (CREs) were identified among the PgWRKY genes (Additional file 8). These cis-elements were associated with abiotic stress (ABRE, ARE, DRE, HSE, LTR, MBS, ACE, AE-Box, MNF1, MRE, SP1, etc.); biotic stress (EIRE, ELI-Box3, BoxW1, WUN motif); and hormonal, physiological and plant developmental processes (AUX RR, CE1/3, GCN4 motif, SARE, MBS-I/II, MSA like, SKN motif, AS1/2, dOCT). The presence of such versatile cis-elements reflects the functional divergence of PgWRKYs in P. glaucum (Figure 8).
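Extracting a 1.5 kb upstream region depends on the strand of each gene. The sketch below computes the window coordinates under stated assumptions: 1-based inclusive coordinates, the window measured from the annotated gene start (the text does not say whether the reference point is the transcription start site or the start codon), and clipping at chromosome ends. The function name `upstream_window` is hypothetical.

```python
from typing import Optional, Tuple


def upstream_window(gene_start: int, gene_end: int, strand: str,
                    chrom_length: int, size: int = 1500) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the upstream window, or None if no sequence is left.

    Coordinates are 1-based and inclusive, with gene_start <= gene_end.
    For minus-strand genes the window lies beyond gene_end, and the extracted
    sequence would still need to be reverse-complemented before motif scanning.
    """
    if strand == "+":
        start, end = gene_start - size, gene_start - 1
    elif strand == "-":
        start, end = gene_end + 1, gene_end + size
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip to the chromosome so genes near an end yield a shorter window.
    start, end = max(start, 1), min(end, chrom_length)
    return (start, end) if start <= end else None


if __name__ == "__main__":
    print(upstream_window(5000, 6000, "+", chrom_length=100_000))
    print(upstream_window(5000, 6000, "-", chrom_length=100_000))
```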
Relative expression analysis of PgWRKYs
WRKY transcription factors are well known for their regulatory function in various stress signaling pathways. Twenty-five PgWRKY genes were selected based on their sequence similarity, BLAST analysis, motif conservation, synteny and phylogenetic relationship with well-characterized WRKY genes of other species that have been shown to be involved in abiotic stress tolerance (Additional file 9). The selected genes were subjected to transcript abundance analysis using qRT-PCR to examine their relative expression in different tissues and their probable involvement in dehydration and salinity stress responses. PCR conditions for these PgWRKY genes were standardized using PgWRKY-specific primers (Additional file 10) with pearl millet genomic DNA as the template (Additional file 11; Figure S2).
Tissue-specific expression analysis was performed in leaf, stem and root tissues of pearl millet. The analysis showed that 22 PgWRKYs were expressed in at least one of the selected tissues (Figure 9, Additional file 12; Figure S3), whereas PgWRKY16, PgWRKY39 and PgWRKY55 were not expressed in any of the analyzed tissues. PgWRKY4, PgWRKY18 and PgWRKY96 were expressed predominantly in root tissues. Most PgWRKYs showed lower expression in stem than in leaf and root tissues, the exceptions being PgWRKY41 and PgWRKY44. In detail, 8 PgWRKYs (PgWRKY2, PgWRKY3, PgWRKY28, PgWRKY46, PgWRKY52, PgWRKY56, PgWRKY74 and PgWRKY92) showed relatively higher expression in leaves than in stem and root; 10 PgWRKYs (PgWRKY4, PgWRKY6, PgWRKY18, PgWRKY33, PgWRKY59, PgWRKY62, PgWRKY65, PgWRKY72, PgWRKY76 and PgWRKY96) showed relatively higher expression in roots than in leaf and stem; 2 PgWRKYs (PgWRKY41 and PgWRKY44) showed relatively higher expression in stem than in leaf and root; and the remaining 2 PgWRKYs showed similar expression across the tissues.
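The tissue categories can be checked for consistency with the totals of 22 expressed and 25 selected genes. The sketch below encodes the gene numbers exactly as listed in the text; the two genes with similar expression across tissues are not named there, so only their count is used.

```python
# PgWRKY gene numbers per category, copied from the text.
leaf_high = {2, 3, 28, 46, 52, 56, 74, 92}
root_high = {4, 6, 18, 33, 59, 62, 65, 72, 76, 96}
stem_high = {41, 44}
not_expressed = {16, 39, 55}
similar_across_tissues = 2  # genes not named in the text

named = [leaf_high, root_high, stem_high, not_expressed]

# No gene should appear in more than one category.
categories_are_disjoint = sum(len(s) for s in named) == len(set().union(*named))

expressed = len(leaf_high) + len(root_high) + len(stem_high) + similar_across_tissues
selected = expressed + len(not_expressed)

if __name__ == "__main__":
    print("disjoint:", categories_are_disjoint)
    print("expressed in at least one tissue:", expressed)
    print("selected for qRT-PCR:", selected)
```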
The expression patterns of the selected PgWRKYs were analyzed under drought and salt stress conditions at different time points using qRT-PCR. As shown in Figures 10a and 10b, most of the PgWRKYs showed differential expression under both stress conditions at different time points. Under drought stress, six PgWRKYs were upregulated and nine were downregulated in terms of transcript abundance. The expression of PgWRKY96 and PgWRKY61 was significantly induced under drought stress, whereas PgWRKY2, PgWRKY6, PgWRKY52 and PgWRKY74 were significantly downregulated. Similarly, under salt stress, five PgWRKYs were upregulated and nine were downregulated compared to control samples. The salt-treated plants showed significant upregulation of PgWRKY62 and downregulation of PgWRKY33, PgWRKY44, PgWRKY59, PgWRKY61 and PgWRKY65 compared to control plants at the respective time points. PgWRKY62 showed a similar upregulation pattern under both drought and salt stress. Likewise, PgWRKY33 and PgWRKY65 were downregulated under both drought and salt stress. In addition, we could not detect transcripts of PgWRKY4, PgWRKY16, PgWRKY39 and PgWRKY55 by either semi-quantitative or quantitative RT-PCR under drought and salt stress treatments.
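Relative expression from qRT-PCR is commonly computed with the 2^-ΔΔCt method. This section does not name the quantification method, so the sketch below is an assumption about the general approach and not a description of the authors' calculation. The function names, the two-fold threshold and the Ct values are hypothetical.

```python
def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of a target gene by the 2^-ddCt method."""
    delta_treated = ct_target_treated - ct_ref_treated   # normalize to reference gene
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


def direction(fc: float, threshold: float = 2.0) -> str:
    """Label a fold change using a symmetric, illustrative threshold."""
    if fc >= threshold:
        return "upregulated"
    if fc <= 1.0 / threshold:
        return "downregulated"
    return "unchanged"


if __name__ == "__main__":
    # Made-up Ct values: the target amplifies two cycles earlier under stress.
    fc = fold_change(22.0, 18.0, 24.0, 18.0)
    print(fc, direction(fc))
```

A lower Ct under treatment means more starting template, which is why the exponent carries a minus sign. Statistical significance, as reported in the text, would additionally require replicate measurements and a test that this sketch does not cover.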