1. Summary of transcriptome sequencing, assembly and annotation
Transcription profiles of Caoyuan No.4 and Caoyuan No.2 with thrips treatment (R_T, S_T) and without thrips treatment (R_CK, S_CK) were explored using the Illumina HiSeq 2500 sequencing. Three independent biological replicates were used for each treatment, resulting in twelve samples. Following clean-up and quality filtering, between 89 016 922 and 102 382 586 clean reads were obtained for the 12 samples (Table S1). A transcriptome datebase containing 576 988 transcripts of average length 607.22 bp was obtained using Trinity software, with an N50 length of 883 bp and an E90N50 length of 1303 bp. Moreover, the sequence length of the assembled transcripts in 61% of sequences was < 500 bp in length with a minimum of 201 bp and a maximum of 15,749 bp (Table 2).
Table 2
Transcriptome assembly details
Type | Resource |
Total transcripts num | 314030 |
Total unigenes num | 214166 |
Total sequence base | 213260627 |
Largest | 15748 |
Smallest | 201 |
Average length | 679.11 |
N50 | 895 |
E90N50 | 1486 |
GC percent | 38.97 |
Mean mapped reads | 1337.922722 |
BUSCO score | 64.6%(3.1%) |
All unigenes and transcripts obtained by transcriptome assembly were aligned with six major databases (Nr, Swiss-prot, Pfam, COG, GO and KEGG databases). Of the 214,166 assembled unigenes, 183,951 of them were found to have homologs in the databases NR (97,224), Swiss-prot (77, 732), Pfam (70,878), COG (13,973), GO (69,142) and KEGG (57,356) (Fig. 1a). For the species distribution of the top BLAST hits in the NR database, 48,469 (50.15%) annotated transcripts matched the sequence of Medicago truncatula (Fig. 1b).
2. Identification And Analysis Of Differential Expression Genes (degs)
Principal component analysis (PCA) showed that there were clear differences between S_CK and R_CK, indicating that the differences between the samples were mainly derived from cultivar variation (Caoyuan No.4 and Caoyuan No.2), which explained 28% of the total variation. Specifically, the S_CK treatment group showed obvious differences with the S_T treatment group, while the R_CK treatment group clustered together with the R_T treatment group. The results indicated that Caoyuan No.2 may be more susceptible to thrips than Caoyuan No.4. Moreover, the three replicates of each sampling group clustered well together except for sample S_T3. Therefore, this sample was excluded from subsequent DEGs analysis (Fig. 2).
The DEGs among the four sampling groups were analyzed in the RNA-seq datasets. As shown in Fig. 3, a total of 5,655 significant DEGs were detected. Constitutive differential expression was shown in the control group (S_CK vs R_CK), with 851 contigs constitutively upregulated and 434 contigs downregulated in R_CK. After treatment with thrips, a total of 2,865 upregulated and 461 downregulated contigs were detected in S_T group (S_CK vs S_T), while only 11 and 2 upregulated and downregulated contigs were found in Caoyuan No.4 alfalfa (R_CK vs R_T). In addition, 2260 and 1064 upregulated and downregulated contigs were observed between the two groups treated with thrips (R_T vs S_T) (p-adjust value < 0.01 & |log2FC | >=1) (Fig. 3).
The DEGs were grouped into functional pathways by their assigned KEGG terms. Analysis of DEGs among the resistant (R_CK) and susceptible (S_CK) genotype showed that 188 upregulated and 109 downregulated genes enriched to 53 and 41 different pathways, respectively, under five major KEGG categories. These categories included organismal system, cellular process, environmental information processing, genetic information processing, and metabolism. Specifically, 29 upregulated DEGs were associated with energy metabolism, including ‘photosynthesis-antenna proteins (map00196)’, ‘carbon fixation(map00710)’, ‘nitrogen metabolism(map00910)’, ‘oxidative phosphorylation (map00190) and photosynthesis(map00195)’; 14 upregulated DEGs were involved in the metabolism of terpenoids and polyketides pathway (zeatin biosynthesis (map00908), sesquiterpenoid and triterpenoid biosynthesis (map00909), carotenoid biosynthesis (map00906), diterpenoid biosynthesis (map00904), and monoterpenoid biosynthesis (map00902)); 14 upregulated DEGs were involved in carbohydrate metabolism (fructose and mannose metabolism (map00051), galactose metabolism (map00052), pentose phosphate pathway (map00030), pyruvate metabolism (map00620), amino sugar and nucleotide sugar metabolism (map00520), glycolysis/gluconeogenesis (map00010)); 13 upregulated DEGs were associated with lipid metabolism, including ‘cutin, suberine and wax biosynthesis (map00073)’, ‘glycerophospholipid metabolism (map00564)’, ‘glycerolipid metabolism(map00561)’, ‘sphingolipid metabolism(map00600)’ and ‘steroid biosynthesis (map00100)’; 5 upregulated DEGs were enriched in the glycan biosynthesis and metabolism pathway; and 4 upregulated DEGs were enriched in phenylpropanoid biosynthesis. In addition, the upregulated genes that encode enzymes involved in ‘DNA replication’, ‘sesquiterpenoid and triterpenoid biosynthesis’, ‘mismatch repair’, ‘nucleotide excision repair’, ‘homologous recombination, ‘photosynthesis-antenna proteins’ and ‘cutin, suberine and wax biosynthesis’ were highly enriched in R_CK (p value < 0.01). However, the downregulated genes encoding CDPK, CNGCS, FLS2, RPM1 and EDS1 involved in plant-pathogen interaction was the most highly enriched pathway (Fig.S1), and 13 downregulated DEGs were enriched in the ‘folding, sorting and degradation’ pathway, including ubiquitin mediated proteolysis, RNA degradation and protein processing in endoplasmic reticulum (Fig. 4a,b, Table S2).
For the S_CK vs S_T group, 1,534 upregulated and 108 downregulated genes enriched to 117 and 56 different pathways, respectively. The upregulated genes enriched in all the different pathways, especially in isoflavonoid biosynthesis (map00943), proteasome (map03050), oxidative phosphorylation, amino sugar and nucleotide sugar metabolism, flavonoid biosynthesis (map00941). In addition, the result showed that 256, 210, 94 and 23 upregulated genes were involved in carbohydrate metabolism pathway, translation, energy metabolism (carbon fixation in photosynthetic organisms, nitrogen metabolism, oxidative phosphorylation, sulfur metabolism) and plant-pathogen interaction (encoding CDPK, CaMCML, Pit6, RPM1, CERK1, SGT, HSP90, EDS1, WRKY, NHO1 and PR ) (Fig.S1), respectively. Moreover, we found the downregulated genes mainly enriched in the cutin, suberine and wax biosynthesis, brassinosteroid biosynthesis, stilbenoid, diarylheptanoid and gingerol biosynthesis, and monobactam biosynthesis (Fig. 4c, d, Table S3).
The result showed that 117 genes were shared in both the S_CK vs S_T group and the S_CK vs R_CK group, while the other 2087 DEGs were only observed in the S_CK vs R_CK group (Fig. 5a). We hypothesize that the shared genes in the S_CK vs S_T group might be the induced defensive genes, while the other 2087 DEGs might be the constitutive defensive genes to thrips in Caoyuan No.4. A heat map generated using the 117 shared DEGs was divided into 6 clusters (Fig. 5b, Table S4). For cluster 1, all the 27 DEGs in both S_CK and S_T were upregulated, while the genes of R_CK and R_T were downregulated. These genes were mainly enriched in stilbenoid, diarylheptanoid and gingerol biosynthesis. Cluster 2 contained 22 DEGs upregulated in S_CK, and related to purine metabolism, ubiquitin mediated proteolysis, and amino sugar and nucleotide sugar metabolism. Genes in cluster 3 mainly enriched in tyrosine metabolism, linoleic acid metabolism and alpha-linolenic acid metabolism, and showed lower expression in the S_CK group compared to other groups. For cluster 4, 19 genes were upregulated in the resistant cultivar (R_CK and R_T), but downregulated in the susceptible cultivar (S_CK and S_T). These genes were closely related to cutin, suberine and wax biosynthesis, sesquiterpenoid and triterpenoid biosynthesis, and steroid biosynthesis pathways. Genes in cluster 5 showed the opposite expression patterns to cluster 4. Specifically, 20 DEGs were upregulated in Caoyuan No.2 (S_CK and S_T), but downregulated in the Caoyuan No.4 cultivar (R_CK and R_T). These genes encoded cysteine-rich receptor-like protein kinase (4 genes), rust resistance kinase and WRKY transcription factor. Cluster 6 contained 6 DEGs that were upregulated in the resistant cultivar (R_CK and R_T), encoding nodule lectin. However, most DEGs were unnamed, and had an uncertain function in the heat map, which needs further study to understand their functions (Fig. 5b, TableS4). The other 2087 constitutive defensive genes in Caoyuan No.4 were mainly enriched in energy metabolism (41 genes), carbohydrate metabolism (35 genes), replication and repair (125 genes), translation (35 genes), environmental adaption (22 genes), and lipid metabolism (16 genes) (Fig. 5c, Table S5).
3. qRT-PCR validation of differential expression genes (DEGs)
To confirm the reproducibility and accuracy of different gene expression identified through the RNA-Seq, 11 genes related to MAPK signaling pathway-plant (PNY01608.1, Fig. 6a), defense response (CAO02958.1, Fig. 6b), xyloglucan metabolic process (XP_003625695.1, Fig. 6c), cellulose biosynthetic process (XP_003595051.1, Fig. 6d), pectin biosynthetic process (XP_003595107.1, Fig. 6e), lipid biosynthetic process (XP_003626834.1, XP_003606194.2, Fig. 6f, g), plant-type secondary cell wall biogenesis (XP_003625615.2, Fig. 6h), lignin catabolic process (XP_013448882.1, Fig. 6i) and integral component of membrane (XP_012575371.1 and XP_003602504.1, Fig. 6j, k) were selected for qRT-PCR validation and analyzed to determine the TPM (RNA-Seq) value. The results showed that the expression patterns of the 11 selected genes except XP_003595051.1 obtained through qRT-PCR were all in accordance with the RNA-Seq value (Fig. 6).