Assembly of the transcriptome of V. planifolia roots exposed to Fov
The transcriptome of vanilla roots exposed to Fov was assessed with Illumina sequencing at 2 and 10 dpi. A total of 12 cDNA libraries were paired-end sequenced using the NextSeq 500 system. Sequencing data of these libraries were obtained corresponding to three biological replicates (control and treatment), covering two frames of time along the infection process (Fig. 1). In brief, six libraries corresponding to control and treatment at 2 dpi, as well as six libraries at 10 dpi, produced more than 204 million reads (Fig. 1). Such data were submitted to the GEO platform of NCBI-GenBank (Accession number: GSE134155). To analyze and compare the dispersion of the treatments with respect to the control samples, PCA analyzes were carried out (Additional file 1: Figure S1; and Additional file 2: Figure S2). Figure S1 corresponds to the treatment and control at 2 dpi, whereas Figure S2 corresponds to the treatment and control at 10 dpi; respectively. Pre-processing of raw sequencing reads was carried out with FastQC, which indicated a good per base quality. The results of the quality analysis applied to the raw data, with FastQC software, are hosted under the following link: http://www.uusmb.unam.mx/reportes/170308/Project_MTulio.html (Additional file 3: Table S1). Filtering reads that correspond to the pathogen used at 2 and 10 dpi, discarded 5.33% and 39%, respectively. On the other hand, even that control plants were not inoculated with the fungus, 5% (2 dpi) and 6.48% (10 dpi) of reads aligned to the genome of F. oxysporum f. sp. lycopersici, excluding such reads for subsequent analyzes (Fig. 1). The de novo transcriptome assembly of vanilla resulted in about 45,000 transcripts (Additional file 4: Figure S3). The statistics of the transcriptome assembly carried out by TransRate v1.0.3 [34] can be found in Additional file 5 (Table S2) (Accession number: GSE134155). In addition, the results of processing data, which involved sequencing statistics of raw data and filtered data, statistics of the sequence alignments vs. the de novo transcriptome assembly [35], as well as non-aligned sequences and records of sequences that were cleaned are presented in Additional file 3 (Table S1). The generated transcripts were mapped against the plant databases, using the BUSCO software, obtaining about 99% of complete orthologous. Figure S1 shows the results of the annotation of the vanilla transcriptome with Blast2GO, finding about 11,000 unigenes out of the total 45,000 assembled transcripts (Additional file 4: Figure S3). Among the main functional categories of gene ontology obtained were plant development, plant growth, cell proliferation, signaling, response to stimuli and response to stress. Moreover, counting of reads on the assembled transcripts resulted in approximately 30% of transcripts that fulfilled the counts per million required for the subsequent identification of DEGs. Altogether, the assessment of the transcriptome of V. planifolia roots exposed to Fov revealed that several plant and cellular processes are impacted during the two frames of time evaluated.
Analysis of gene expression and functional categorization of DEGs at 2 and 10 dpi
For the identification of unigenes with changes in expression levels at 2 and 10 dpi, differential gene expression analysis was carried out using several approaches such as DESeq, DESeq2, NOISeq and EdgeR. For libraries corresponding to 2 dpi, 2310, 1702, 4080 and 3420 DEGs were obtained with DESeq, DESeq2, NOISeq and EdgeR, respectively (Fig. 2) (Additional file 6: Table S3). On the other hand, analysis of DEGs at 10 dpi revealed that 812, 534, 839 and 881 DEGs were obtained with DESeq, DESeq2, NOISeq and EdgeR, respectively (Fig. 2) (Additional file 6: Table S3). As EdgeR is the most popular method and taking into account that this method included the vast majority of DEGs, EdgeR was selected for the subsequent analysis (Fig. 2). In that sense, two lists were obtained, one corresponding to the treatment at 2 dpi containing 3420 DEGs and the other corresponding to the treatment at 10 dpi with 881 DEGs. In the case of DEGs at 2 dpi, 1563 genes were found to be up-regulated, whereas 1857 genes were down-regulated. On the other hand, classification of DEGs at 10 dpi as up- and down-regulated genes, resulted in 250 and 631 genes, respectively. An overview of the transcriptional change at 2 and 10 dpi is shown in Fig. 3. At a glance, subsets of certain DEGs showed contrasting expression profiles if both treatments are compared (Fig. 3).
The lack of reference genome for V. planifolia forced to check orthology with available genomes for which annotation is complete. Accordingly, orthologs of Arabidopsis corresponding to DEGs at 2 and 10 dpi were obtained, resulting in 603 and 278 orthologs, respectively (Additional file 7: Table S4). As a first approach to elucidate the putative functions of DEGs at 2 and 10 dpi, gene orthologs were submitted to MapMan [36]. Pathway analysis of DEGs with P-value cut-off of ≤0.05 was carried out on Arabidopsis pathway genes. Accordingly, 603 (2 dpi) and 278 (10 dpi) DEGs were analyzed with MapMan, from which only 535 and 149 were categorized, respectively (Fig. 4). Visualization of the DEGs assigned to functional categories revealed that orthologs with differential expression at 2 dpi showed most enriched categories than that of 10 dpi (Fig. 4). Remarkably, most of data points contained within the functional categories at 2 dpi were up-regulated genes, whereas down-regulated genes were mostly associated to functional categories at 10 dpi (Fig. 4). Among the most enriched categories, genes encoding products involved in regulation of transcription (27) and protein synthesis (29) were observed in both cases (2 and 10 dpi) (Fig. 4). However, the number of genes associated to those functional categories was contrasting. For instance, whereas only 20 data points were found within the category of protein for DEGs at 10 dpi, 123 were found in the case of data corresponding to 2 dpi (Fig. 4). Other enriched categories at 2 dpi were cell wall (10), lipid metabolism (11), amino acid metabolism (13), secondary metabolism (16) and hormone metabolism (17) (Fig. 4). Thus, the functional categorization of DEGs suggest that the major transcriptional change occurs at early stages of infection, namely at 2 dpi.
To further comprehend the functions of DEGs at 2 and 10 dpi, an analysis according to the enrichment of GO terms was carried out using the orthologs. Such analysis in the platform of agriGO resulted in the main functional categories associated to DEGs at 2 dpi (Table 1), but not for DEGs at 10 dpi. For up-regulated genes at 2 dpi, 18, 4 and 30 categories were significantly enriched, corresponding to biological process (P), molecular function (F) and cellular component (C), respectively (Table 1). The GO Term Enrichment Analysis was also performed with the PANTHER classification system software (v.14.0), to determine the categories of Gene Ontology significantly enriched in DEGs at 2 dpi, present in the up and down-regulated transcripts (Additional file 8: Table S5, and Additional file 9: Table S6 respectively). In the case of down-regulated genes, 9, 3 and 3 categories were found significantly enriches for P, F and C, respectively (Table 1). A schematic representation of biological processes shown in Table 1 allowed to appreciate that translation and cell wall modification were the main processes overrepresented in up- and down-regulated genes, respectively (Additional file 10: Figure S4). Taken together, the functional categorization of DEGs not only suggests that the major transcriptional change occurs at early stages of infection (2 dpi), but also indicates that up-regulated genes are mainly associated to translation, whereas down-regulated genes are involved in cell wall remodeling.
Functional association networks of DEGs at 2 dpi
Since the functional categorization suggested that DEGs coding for protein-related processes are the most contrasting categories when datasets of 2 and 10 dpi are compared, further inspection was carried out for DEGs at 2 dpi. Since genes encode products that interact each other, a network was generated to look for relationships among DEGs at 2 dpi (Fig. 5). Briefly, out of 309 up-regulated genes at 2 dpi with an ortholog in the Arabidopsis genome, only 282 were recognized by String [37]. Accordingly, most of interactions observed in the network corresponded to experimental data (purple lines) (Fig. 5A). Particularly, a central network was formed, involving 98 genes (nodes), from which most of them (80 nodes) were related to translation (structural constituents of ribosome) (Fig. 5A). For example, genes encoding proteins such as Ribosomal protein S12/S23 (Rps12/s23), Ribosomal protein l24B (Rpl24B), Ribosomal protein s6 (Rps6), Ribosomal protein s13 (Rps13), among others, formed the central network (Additional file 12: Table S7). In addition to genes involved in translation, genes associated to development were also found (Additional file 12: Table S7). Among this group, YODA (YDA), STEROL METHYLTRANSFERASE 1 (SMT1), EMBRYO DEFECTIVE 2386 (EMB2386), MATERNAL EFFECT EMBRYO ARREST 22 (MEE22), and others, were found (Additional file 12: Table S7).
On the other hand, for down-regulated genes at 2 dpi, 256 were recognized by String out of 294 submitted genes (Fig. 5B). Also, a central network (49 nodes) was obtained with genes involved mainly in cell cycle, DNA replication and cell wall organization (Fig. 5B) (Additional file 12: Table S7). In this case, genes encoding proteins such as Cyclin A1;1 (CycA1;1), Cyclin-dependent kinase B2 (CdkB2), Minichromosome maintenance 3 (Mcm3), Origin recognition complex subunit 3 (Orc3), Cellulose synthase-like protein D5 (Csld5), Cellulose synthase A catalytic subunit 8 (CesA8), Cellulose synthase A catalytic subunit 7 (CesA7), among others, clearly formed a central network (Fig. 5B) (Additional file 12: Table S7). Notably, these networks were exclusively for DEGs at 2 dpi, since DEGs corresponding to 10 dpi did not show a clear interaction (Additional file 11: Figure S5). In resume, the generation and visualization of relationships among DEGs at 2 dpi show significantly more interactions than expected. In the case of up-regulated genes, they are mainly associated to ribosome biogenesis and translation as well as in development, whereas down-regulated genes are involved in cell cycle, DNA replication and cell wall organization.
Differential gene expression of ribosome-related proteins at 2 dpi
The finding that mainly proteins involved in ribosome biogenesis and translation (structural constituents of ribosome) were the most significant DEGs at 2 dpi, encouraged to focus on these genes. As observed in Fig. 6, proteins related to ribosome biogenesis and translation were significantly up-regulated in 2 dpi compared to 10 dpi, 72 of which were exclusively expressed in the treatment at 2 dpi. These exclusive genes corresponded to ribosomal proteins, for which a significant increase in their expression pattern was observed only at 2 dpi (Fig. 6). As mentioned before, ribosomal proteins such as Ribosomal protein s12/s23 (Rps12/S23), Ribosomal protein l24B (Rpl24B), Ribosomal protein s6 (Rps6), Ribosomal protein s13 (Rps13), among others, were found up-regulated at 2 dpi. In summary, ribosomal-related proteins are found up-regulated at 2 dpi, suggesting that translation is impacted upon infection by Fov.