Comparative transcriptomes and genome-wide identification reveal salt stress-responsive PP2C in Jute (Corchorus capsularis)

Aminu Kurawa Ibrahim Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Yi Xu Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Qingyao He Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Sylvain Niyitanga Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Muhammad Zohaib Afzal Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Lilan Zhang Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Jianmin Qi Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China. Liwu Zhang (zhang_liwu@hotmail.com) Fujian Agriculture and Forestry University, Fuzhou 350002, Fujian, China.


Introduction
Corchorus olitorius L. and Corchorus capsularis L. are essential fiber crops cultivated around the world [1]. Global demand for these crops is increasing because they are eco-friendly and have a broad spectrum of applications [2]. However, their production is affected by salt stress, a major environmental factor that hinders plant growth and development and reduces overall yield [3]. Salt stress exerts both primary effects (shifts in ion content or osmotic dynamics, leading to ion toxicity) and secondary effects (oxidative stress, damage to cellular components, and metabolic dysfunction) on the plant. In response, affected plants activate a flow of signal transduction pathways that alter gene expression, membrane trafficking, hormone levels, and protein phosphorylation [4]. A wide range of physiological, biochemical, and molecular processes in plants are affected by salt stress [5]. Salt stress also alters the levels of reactive oxygen species (ROS) and the expression of genes, particularly those encoding transcription factors (TFs) [6]. When salt-tolerant and salt-sensitive genotypes are compared, plant hormone signal transduction and arginine and proline metabolism are the most significant pathways to consider. The phytohormone abscisic acid (ABA) is among the salt-stress-induced compounds central to salt stress responses [3]. First, the plasma membrane nicotinamide adenine dinucleotide phosphate oxidase (NADPH oxidase) generates hydrogen peroxide (H2O2) to modulate calcium signals that affect downstream ABA responses [7]. ABA binds to pyrabactin resistance 1-like (PYR/PYL) receptors and type 2C protein phosphatases (PP2Cs), forming a PYL-ABA-PP2C complex that activates serine/threonine-protein kinase SRK2 (SnRK2s) [8]. Additionally, antioxidant production is regulated through a group of protein kinases that phosphorylate bZIP and other TFs [9].
Moreover, mitogen-activated protein kinase (MAPK) pathways are activated through ABA, directly phosphorylating many other ABA effector proteins [7].
A balance between multiple pathways is needed for any plant developmental process, and this balance can easily be compromised under abiotic stress [10]. Production and balance of ROS are related to photosynthesis, phenylalanine metabolism, and peroxidase pathways [11]. Under stress, electrons in the photosynthetic pathway leak to O2, resulting in massive generation of ROS. However, numerous effector genes involved in peroxidase pathways, such as superoxide dismutase (SOD), ferulic acid [12], catalase (CAT) [10], and isocitrate dehydrogenase (ICDH), show differential expression. If these pathways can keep ROS at a relatively low level and establish a new balance, tissues and cells may be protected from oxidative damage or even death. Peng et al. [13] investigated salt-tolerant and salt-sensitive cotton genotypes by transcriptomic analysis and reported up-regulated genes related to abscisic acid, transporters, ethylene, and membrane receptor signal transduction. Improving salt tolerance in jute will enhance its cultivation in saline environments. However, many previous studies concentrated on morphological, physiological, and proteomic analyses, ignoring transcriptomic approaches [7].
PP2Cs modulate stress-signaling pathways and reverse stress-induced protein kinase (PK) cascades in response to complex environmental stimuli [14]. In higher plants, PP2Cs negatively regulate the ABA signaling pathway and decrease tolerance to oxidative stress [15]. Proteins encoded by PP2C candidate genes play a vital role in signaling under various abiotic stresses such as salt, drought, and freezing [16]. The PP2C family has been identified and characterized in Arabidopsis, rice, maize, and cotton, among other species, which have developed complex molecular mechanisms to survive adverse growth conditions [17,18]. These studies have established the diverse roles of PP2C genes in plant development and responses to environmental stresses. However, no such study has been conducted in Corchorus capsularis, even though its utilization and productivity are affected by salt stress. This study is therefore needed, and the availability of genome data makes it possible. Transcriptome sequencing of twenty-four samples from the root and leaf tissues of salt-tolerant and salt-sensitive jute germplasms was used to identify potential salt-stress-responsive genes and clarify the regulatory pathways involved in salt tolerance in jute. Moreover, a comprehensive analysis of Corchorus capsularis PP2C was carried out, covering the phylogenetic relationship between Corchorus capsularis and Arabidopsis PP2C, conserved motifs, gene structure, the expression profile of key stress marker genes (RAB18, RD29B, KIN1, and RD29A), and chromosome mapping of Corchorus capsularis PP2C genes.

Results
Transcriptome assembly and sequencing
Transcriptome analysis of the twenty-four salt-tolerant (J194) and salt-sensitive (J7) samples was performed using Illumina paired-end sequencing technology to explore the DEGs related to NaCl stress in jute, with two biological replicates for each combination of tissue (leaf and root) and time (0, 6, and 12 hours) (Additional Table S3). The comparison of J194 and J7 under salt exposure is presented in Figure 1a-b. In total, 1518813542 raw reads were generated, 759406771 clean reads were obtained, and the sequences were mapped to the jute genome (Additional Table S3). The GC content ranged between 44.24% and 45.59%, and the Q30 scores were greater than 90% (Additional Tables S3 and S4). The clean reads were used to assemble 27720 transcripts (Additional Table S3) to be used as a reference sequence for downstream analysis and 26463 unigenes (Additional Table S4) 7272 unigenes. When the clean reads mapped to the reference genome were assigned to exon, intron, and intergenic regions, exon-region sequences accounted for more than 80% of the genome-mapped sequences, as expected (Additional Figure S1). These results indicate a high degree of annotation accuracy. Additionally, the box plot comparison of Fragments Per Kilobase of transcript per Million mapped reads (FPKM) indicates that the sequencing results were reliable; the samples yielded equivalent reads and coverage depths between duplicates (Fig. 1b).
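For readers unfamiliar with the FPKM unit used in the box plot comparison, the following is a minimal sketch of the standard definition. The function name and the numbers in the example are hypothetical and are not taken from the study's data; the expression values reported here were produced by the pipeline described in the methods, not by this snippet.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2 kb transcript
# in a library of 25 million mapped fragments.
example_value = fpkm(500, 2000, 25_000_000)
print(example_value)
```

Because the count is divided by both transcript length and library size, values are comparable across genes of different lengths and across libraries of different depths, which is why replicate libraries are expected to show similar FPKM distributions.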
To understand the functions of the unigenes, 26463 unigenes were annotated against all eight databases. Most of the unigenes (26413) aligned with the NR database, followed by eggNOG (22539) and Pfam (19493), whereas the KEGG database had the fewest annotated unigenes (8992) (Fig. 1c). DEG annotation across the treated samples (Additional Table S5) indicated that J7R_0H vs. J7R_6H had the highest number of annotated unigenes across all the databases, followed by J7L_0H vs. J7L_6H, whereas J194L_0H vs. J194L_6H had the lowest number. Enrichment analysis showed that the most commonly enriched pathways were related to plant hormone signal transduction (Additional Table S6), which appears vital for the jute response to NaCl exposure.
Identification of differentially expressed genes in response to salt treatment
Analysis of DEGs in J7 and J194 in different tissues (roots and leaves) under control and salt stress conditions indicated that 26463 unigenes were differentially expressed (Additional Table S7). Cluster analysis revealed that differential gene expression profiles were more similar within the same tissue type and exposure time than between different genotypes and time durations (Fig. 2a). The dispersion of these differentially expressed genes is presented in Additional Figure S2.
We observed that genes within the same tissues under the same period of exposure to NaCl clustered together. Additionally, replicates of the treated and control samples were highly consistent, indicating the reproducibility of the RNA-seq results.
The results showed that plant hormone signal transduction (ko04075) was the most enriched pathway in root tissues at six hours of exposure to NaCl stress. Plant hormones such as auxins, jasmonic acid, and abscisic acid play significant roles in the plant response to stresses by regulating hormone signal transduction pathways [19,20]. Among the 62 enriched DEGs observed in both J194 and J7 root tissues at six hours of exposure to NaCl stress (Additional Table S8), 17 had unknown functions, while the remaining 45 were generally related to abiotic stress responses.
Plant hormone signal transduction pathways of the up- and down-regulated DEGs in the root tissues of the two germplasms (J194 and J7) at six hours of exposure to NaCl stress are presented in Fig. 2c and Additional Figure S3. Three DEGs were involved for AUXIN1: Corchorus_capsularis_newGene_739, Cc.01G0031930, and Cc.04G0019670. These genes were down-regulated in J194 and J7. For IAA, five DEGs were identified; however, only Cc.07G0002340 (down-regulated) and Cc.07G0003430 (up-regulated) were recorded in J194. In J7, two DEGs (Cc.01G0010780 and Cc.07G0003430) were up-regulated and three DEGs (Cc.04G0043620, Cc.04G0045370, and Cc.07G0002340) were down-regulated (Fig. 2c). For ARF, three DEGs were involved, but only one (Cc.06G0025590) was down-regulated in both J194 and J7; another (Cc.06G0027580) was down-regulated in J7 only. For IAA, one DEG (Cc.07G0005110) was up-regulated in both J194 and J7, while for the GH3 signaling pathway, two up-regulated DEGs (Cc.01G0028710 and Cc.02G0006030) were found in both J194 and J7. SMALL AUXIN UP RNAs (SAUR) involved eight DEGs, of which four (Cc.03G0019440, Cc.03G0029630, Cc.04G0007220, and Cc.05G0005100) were up-regulated in both J194 and J7, whereas the remaining four (Cc.02G0011530, Cc.02G0022670, Cc.04G0014260, and Cc.06G0027410) were up-regulated only in J7.
For PYL, one DEG (Cc.03G0016680) was up-regulated only in J7, whereas six DEGs were involved in the PP2C pathway: three (Cc.03G0000600, Cc.03G0030800, and Cc.07G0001880) were up-regulated in both J194 and J7, two (Cc.03G0016550 and Cc.07G0028160) were up-regulated in J194, and one (Cc.06G0030850) was up-regulated only in J7. The SnRK2 signaling pathway had only three DEGs, including Cc.02G0003620 and Cc.04G0017920, which were up-regulated, and Cc.02G0021190, which was down-regulated in J7. For ABA-responsive element binding factor (ABF), three DEGs were involved; one (Cc.04G0004780) was up-regulated in both J194 and J7, whereas Cc.01G0035870 and Cc.06G0010680 were up- and down-regulated, respectively, specifically in J7. Two DEGs were involved in the JASMONATE ZIM DOMAIN (JAZ) pathway; however, only one (Cc.06G0030170) was up-regulated in both J194 and J7. In the MYC2 pathway, only one gene (Cc.04G0013900) was down-regulated in both J194 and J7.
Quantitative reverse transcription-PCR validation of the RNA-seq results is presented in Additional Figure S4. The expression of all the studied genes was consistent between qRT-PCR and RNA-seq, except that in sample J7R_12H vs. CR the ABF/bZIP genes were down-regulated in qRT-PCR rather than up-regulated.
Correlation analyses between germination-rate-related traits and seedling-stage parameters, and between qRT-PCR and RNA-seq results, are presented in Additional Figures S5 and S6, respectively. The results indicated a significant positive correlation between germination-related traits (RGS and RGVS) and seedling-stage parameters under NaCl stress, except for DRW, RL, RFW, and DSW. Moreover, a significant positive correlation was also observed between qRT-PCR and RNA-seq results (Additional Figure S6).
The relative expression levels of seven randomly selected candidate genes in J194 and J7 at one and two weeks of exposure to NaCl stress are presented in Additional Figure S7a-b, validating the candidate genes. The relative expression levels in J194 and J7 at one week are shown in Additional Figure S7a and c. In both leaf and root tissues, relative expression was higher in J194 than in J7, although the expression of Cc.06G0024090 (MYB) was higher in the control than in the treated samples of both J194 and J7. The relative expression levels in J194 and J7 at two weeks are presented in Additional Figure S7b and d. Expression levels were slightly higher than at one week. The genes were still more highly expressed in treated J194 root and leaf samples than in those of J7; however, Cc.06G0024090 (MYB) again had its highest expression in the control leaf sample compared with the treated samples. The genes were more highly expressed in the leaf than in the root.

Identification of Corchorus capsularis PP2C gene family
The Corchorus capsularis PP2C gene family members and their homologs in other species are presented in Table 1; the table shows 89.8-100% relative similarity, and most of the homologs were from Theobroma cacao (15 genes), Herrania umbratica (8 genes), and Durio zibethinus (5 genes). The conserved domains of Arabidopsis and Corchorus capsularis PP2Cs were identified using SMART and PFAM after removing redundant and non-PP2C sequences (Additional Table S10). About 78 Arabidopsis PP2Cs were identified and used as query sequences to search against the Corchorus capsularis genome using NCBI local BLAST, as stated in the materials and methods. After confirming the presence of PP2C catalytic domains using SMART and PFAM, about 38 genes were identified (Additional Table S10).
Group O Cc.04G0042080 had 100% bootstrap support with AT5G19280. The distance tree was inferred using the neighbor-joining method based on an alignment of full-length amino-acid sequences of the PP2C conserved domains. The genes were grouped according to the Arabidopsis PP2C genes previously reported.
For the identification of motifs and gene structure, 11 Corchorus capsularis PP2C motifs were identified; their sequences and distributions are presented in Fig. 4a. Motifs 3, 2, 5, 11, 4, and 1 were widespread among the Corchorus capsularis PP2C groups. However, motifs 9 and 6 were specific to two groups (D and C), whereas motifs 10 and 7 were exclusive to group D. Moreover, motif 8 was specific to groups C and E.
The phylogenetic tree was constructed from alignments of 38 amino acid sequences of PP2C conserved domains in Corchorus capsularis; it was grouped according to the Arabidopsis PP2C genes previously reported. The sequences and lengths of the conserved motifs in the amino acid sequences of Corchorus capsularis PP2C genes are indicated at the bottom.
The relative gene expression levels of Cc.04G002237 (RAB18), Cc.03G0029910 (RD29B), Cc.01G0008270 (KIN1), and Cc.03G0029910 (RD29A) at 0, 6, and 12 hours of exposure to NaCl are presented in Fig. 5a-d. The salt-tolerant germplasm (J194) had the highest relative gene expression levels across the time points. J194 root tissues at six hours of exposure to NaCl recorded the highest expression levels for all the studied genes.
Moreover, the relative expression levels of these genes at one and two weeks of exposure to NaCl are presented in Fig. 6a-d. J194 again had the highest relative expression levels for all studied genes across the periods of exposure to NaCl. However, the relative expression levels of Cc.04G002237 (RAB18) and Cc.03G0029910 (RD29B) were higher in J194 root tissues across the period of NaCl exposure, whereas Cc.01G0008270 (KIN1) and Cc.03G0029910 (RD29A) had their highest expression levels in J194 leaf tissues at all durations of exposure to NaCl.
The qRT-PCR analysis was conducted to generate the relative expression profiles of key established stress marker genes, RAB18, RD29B, KIN1, and RD29A (a to d, respectively), in J194 and J7 leaf (L) and root (R) tissues at CT (0), 6, and 12 hours, as indicated on the X-axis; the relative expression level is indicated on the Y-axis. Data are the means of the replicated samples, presented as columns, and error bars denote the standard deviation. *p-value < 0.05 and **p-value < 0.01 indicate statistically significant levels.
The qPCR analysis was conducted to generate the relative expression profiles of key established stress marker genes, RAB18, RD29B, KIN1, and RD29A (a to d, respectively), in J194 and J7 leaf (L) and root (R) tissues at 0 (control), one, and two weeks of exposure to NaCl. The X-axis represents the control (C) and treated (T) samples, and the relative expression level is indicated on the Y-axis. Data are the means of the replicated samples, presented as columns, and error bars denote the standard deviation. *p-value < 0.05 and **p-value < 0.01 indicate statistically significant levels.

Discussions
Identification of genes and their functions can be aided by grouping genes with the same or similar expression patterns using hierarchical clustering analysis based on expression levels determined from FPKM values. In general, genes with the same or similar functions, or functioning in the same pathways, cluster together. In this study, we observed that genes from the same tissues exposed to NaCl for the same duration clustered together (Fig. 2a). DEGs within a single cluster appeared as co-expressed genes. The color-coding of the different cluster groupings signifies genes with similar expression patterns that share the same functions and thus participate in the same biological processes.
Furthermore, among up- and down-regulated DEGs in leaf tissues, J7 at six and twelve hours of exposure to NaCl, as well as their combination, had the highest number of DEGs (Additional Figure S2(i)), whereas J194 at six and twelve hours and their pairing had the lowest number. In root tissue, J7 at six and twelve hours of exposure to NaCl, as well as their combination, likewise had the highest number of DEGs (Additional Figure S2(ii)), with much higher numbers at six hours than at 12 hours, and J194 again had the lowest number. These results are consistent with several previous studies [22][23][24], indicating that salt-sensitive germplasms display more DEGs than tolerant germplasms and that 6 hours of exposure to NaCl yields more DEGs than 12 hours.
This is likely because the effect of salt stress on the salt-sensitive germplasm was greater than on the tolerant germplasm. Interestingly, we recorded a higher number of DEGs when pairing the 6- and 12-hour samples of the sensitive germplasm (J7) than when pairing those of the tolerant germplasm (J194) (Additional Figure S2(i-ii)). Overall, J194 (salt-tolerant) had fewer DEGs than J7 (salt-sensitive) (Additional Figure S2(i-ii)). These findings agree with those reported in previous studies [22][23][24]. Moreover, the roots of J194 and J7 had more DEGs than the leaves, which suggests that some root-associated genes do not necessarily play roles in the plant response to salt stress.
The ABA signaling pathway comprises many regulated genes and plays a role in the plant response to stress. It also mediates adaptation to environmental changes [6]. SnRK2 usually positively regulates ABA signaling, which is critical in abiotic stress responses, especially NaCl stress. A high level of ABA is maintained during seedling growth, which protects the plants from damage caused by salt stress [25]. The plant hormone auxin plays a vital role in regulating plant developmental processes [26]. In the present study, most of the auxin-related genes observed in the root tissues exhibited similar expression patterns in J7 and J194 samples at 6 hours of exposure to NaCl, which indicates their significance in plant abiotic stress responses and a conserved mechanism of action. Auxin carriers such as auxin influx carrier 1 (AUX1) mediate the polar transport of auxin [27]. Hormone signaling pathways were up-regulated in J194 and J7, reflecting their significant roles in plant stress tolerance, and plant hormone signal transduction pathways were the most enriched. Among about seven novel DEGs identified, one AUXIN1 DEG (Corchorus_capsularis_newGene_739) was down-regulated in both the salt-tolerant and salt-sensitive germplasms in root tissues at 6 hours of exposure to NaCl.
For the PYL-ABA-PP2C pathway, we recorded seventeen DEGs (Cc.03G0016680, Cc.03G0000600, Cc.03G0016550, Cc.03G0027770, Cc.03G0030800, Cc.06G0030850, Cc.07G0001880, Cc.07G0028160, Cc.07G0031700, Cc.02G0003620, Cc.02G0021190, Cc.03G0023450, Cc.04G0017920, Cc.07G0021650, Cc.01G0015540, Cc.01G0035870, and Cc.04G0004780) in the salt-stressed root tissues of both J194 and J7, which is consistent with some previous studies [24,28]. Interestingly, our results showed that the PYL gene (Cc.03G0016680) was up-regulated, which supports the basic ABA signaling model but contradicts the findings of [24]. Under normal circumstances, PYL binds to ABA and PP2Cs to form PYL-ABA-PP2C complexes, thereby inhibiting PP2Cs. This inhibition releases autophosphorylating SnRK2s, which then phosphorylate many downstream effectors [3]. PYL expression is therefore expected to keep pace with ABA concentration and to be up-regulated under stress conditions. Additionally, our KEGG analysis confirmed the enrichment of plant hormone signal transduction (Additional Table S9 and Fig. 2c) in the root tissues at 6 hours of exposure to NaCl. By regulating hormone signal transduction pathways, plant hormones such as auxins, jasmonic acid, and abscisic acid play a significant role in the plant response to abiotic stresses [19,20]. The importance of ABA in plant responses to these stresses is conserved. Interestingly, in the plant hormone signal transduction pathway (Fig. 2c), three Corchorus capsularis PP2C genes (Cc.06G0030850, Cc.03G0016550, and Cc.07G0028160) were found to be up-regulated in J194 roots at 6 hours of exposure to NaCl; they have 57, 100, and 99% bootstrap support with the Arabidopsis PP2C genes (AT1G17550, AT1G72770, AT4G26080, and AT5G57050), AT5G53140, and (AT1G07630 and AT2G28890), respectively. The chromosomal locations of these genes (Additional Figure S8) also indicated their involvement in segmental duplication.
As such, these genes could serve as candidates for salt tolerance in Corchorus capsularis.
Moreover, the phytohormone abscisic acid plays vital regulatory roles in salt, drought, and cold stresses during plant development [29,30]. NaCl stress was therefore assessed by monitoring the expression patterns of the key stress marker genes (RAB18, RD29B, KIN1, and RD29A) in J194 and J7 under control and stress conditions. The expression of these stress marker genes in root tissues at six hours of exposure to NaCl was higher in the salt-tolerant germplasm (J194) than in the sensitive germplasm (J7). Our findings are consistent with those reported by Singh et al. [31], which signifies that the overexpression profiles of the Corchorus capsularis PP2C (Cc.04G002237, Cc.03G0029910, Cc.01G0008270, and Cc.03G0029910) genes reveal a significant interplay of ABA-dependent and ABA-independent pathways for abiotic stress. At one and two weeks of exposure to NaCl, J194 still had the highest gene expression levels. The expression profiles of these stress marker genes were, however, tissue-specific: RAB18 and RD29B were highly expressed in root tissues, whereas KIN1 and RD29A were most highly expressed in leaf tissues.
The analysis of gene structure and conserved motifs based on the phylogenetic relationship was conducted to gain insight into the structural features of the Corchorus capsularis PP2Cs. Eleven motifs were identified, which is consistent with those reported by Cao et al. [14,21] and different from the number reported in cotton [32]. Most of the PP2C family members contained motifs 3, 2, 5, 11, 1, and 4, except for group D, which had some specific motifs (7, 10, 9, and 6). The results revealed that all the identified PP2C genes have some evolutionary relationship; thus, proteins classified into the same subgroup share similar sequences. In the Corchorus capsularis PP2C gene structures, the numbers of exons and introns range between 3-21 and 2-20, respectively, which contradicts the findings of Cao et al. [14,21,32]. These results revealed that PP2C genes in the same subgroup show more or less similar exon-intron organization. The work further revealed that differences in motif distribution among the protein sequences provide evidence of functional divergence among sub-families. Moreover, multiple alignments revealed that not all Corchorus capsularis PP2C domains contain all the conserved motifs, owing to partial deletion in the C-terminal of the PP2C phosphatase catalytic domain, which results in fewer motifs and loss of function; similar findings were reported in [14]. However, proteins in the same subfamily exhibit the same motif distribution, supporting their close evolutionary relationship. As indicated in Figs. 3 and 4b, when the exon-intron structures of the Corchorus capsularis PP2C genes were examined according to their phylogenetic relationship, the proteins remained in the same subfamilies. Moreover, the structural diversity of protein domains and exon-intron patterns plays a vital role in gene family evolution; these findings are consistent with those reported by Cao et al. [14,32].
Small-scale tandem duplications and large segmental duplications are the two central mechanisms contributing to genome complexity in the plant kingdom [33]. Xue et al. [21] reported that the PP2C family expanded through chromosomal and whole-genome duplication. Moreover, as mentioned by Cheung et al. [34], closely related genes located less than 200 kb apart on the same chromosome are referred to as tandem duplications; otherwise, they are segmental duplications. Two pairs of paralogous Corchorus capsularis PP2C genes, on chromosome five (Cc.05G0029950 and Cc.05G0029870) and chromosome four (Cc.04G0047530 and Cc.04G0048240), were found to be involved in tandem duplication events; the others were segmental duplications. Most of the Corchorus capsularis PP2C genes were thus involved in segmental duplication, which is consistent with the findings of [14,32]. This type of duplication plays a vital role in expanding the Corchorus capsularis PP2C gene family. The duplicated genes might have experienced functional divergence and may have lost their original functions rather than developed novel functions [35].
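The 200 kb rule attributed to Cheung et al. [34] can be illustrated with a short sketch. The gene identifiers and coordinates below are hypothetical placeholders, not positions from the jute genome, and measuring distance between gene start positions is an assumption made for simplicity.

```python
from dataclasses import dataclass

TANDEM_MAX_DISTANCE_BP = 200_000  # 200 kb threshold described in the text


@dataclass(frozen=True)
class Gene:
    gene_id: str      # hypothetical identifier
    chromosome: str
    start: int        # hypothetical start coordinate in bp


def classify_duplication(a, b, max_distance=TANDEM_MAX_DISTANCE_BP):
    """Label a paralogous pair as 'tandem' or 'segmental'.

    Tandem: same chromosome and less than max_distance apart.
    Everything else is treated as segmental, as in the text.
    """
    same_chromosome = a.chromosome == b.chromosome
    if same_chromosome and abs(a.start - b.start) < max_distance:
        return "tandem"
    return "segmental"


# Hypothetical pairs for illustration only.
near_pair = (Gene("geneA", "Chr05", 1_000_000), Gene("geneB", "Chr05", 1_150_000))
far_pair = (Gene("geneC", "Chr04", 1_000_000), Gene("geneD", "Chr04", 1_500_000))
print(classify_duplication(*near_pair))
print(classify_duplication(*far_pair))
```

The sketch assumes the pair has already been established as paralogous (for example from the phylogenetic tree); it only applies the distance criterion.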

Source of plant materials
About 292 jute germplasms were used for the screening of tolerant and sensitive individuals at the germination stage. A preliminary germination experiment was set up to select a suitable NaCl concentration for the study. Different NaCl concentrations (40 mM, 120 mM, and 200 mM) and a control (deionized water) were tested on the parent strains. Petri dishes of about 12 x 12 x 5 cm with layers of filter paper were used to germinate 50 seeds from each parent. 18 mL of the prepared treatment solution (NaCl dissolved in deionized water) or control (deionized water) was poured into each Petri dish, and the seeds were then spread in the dishes. A completely randomized design was used and repeated three times. Petri dishes were kept in a growth chamber maintained at 28 °C during the day and 22 °C at night on a 16-hour light/8-hour dark cycle for six days. Germination was recorded as germination percentage: (number of germinated seeds / total number of seeds sown) x 100. Based on the mean germination percentages of the two parents (result not shown), 120 mM NaCl was identified as a suitable concentration and was used to screen the 300 germplasms for their response to NaCl, following the same experimental protocol. Two germplasms, salt-tolerant (J194) and salt-sensitive (J7), were selected for transcriptomic analysis. They were further subjected to different salt concentrations (200, 250, and 300 mM) at the seedling stage using Pindrustup media to obtain a suitable concentration for the experiment. The salt solutions were applied at the two- to three-leaf stage for two weeks. Data were taken and analyzed, based on which the 250 mM concentration was selected (result not shown).
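The germination percentage defined above, and its mean over the three repeats, can be written as a small sketch. The seed counts in the example are hypothetical, since the underlying germination data are not shown in the paper.

```python
def germination_percentage(germinated, sown):
    """(number of germinated seeds / total number of seeds sown) x 100."""
    if sown <= 0:
        raise ValueError("number of seeds sown must be positive")
    if not 0 <= germinated <= sown:
        raise ValueError("germinated count must be between 0 and seeds sown")
    return germinated * 100 / sown


def mean_germination(germinated_counts, sown_per_dish):
    """Mean germination percentage across replicate Petri dishes."""
    percentages = [germination_percentage(g, sown_per_dish) for g in germinated_counts]
    return sum(percentages) / len(percentages)


# Hypothetical counts for three replicate dishes of 50 seeds each.
hypothetical_counts = [41, 38, 44]
print(mean_germination(hypothetical_counts, 50))
```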
Salt treatment and sample preparations
J194 and J7 were grown in separate pots filled with Pindrustup media under normal conditions. At the two- to three-leaf stage, seedlings from each sample (J194 and J7) were rinsed with distilled water and immediately placed into tubes filled with 250 mM NaCl solution or the control (distilled water, CK). Root and leaf tissues from each germplasm were harvested, frozen immediately in liquid nitrogen, and then stored at -80 °C for RNA extraction. Two biological replicates of each sample were used for RNA extraction.

RNA-extraction and isolation
Total RNA was isolated from the leaf and root tissue samples using Trizol (Invitrogen, Santa Clara, CA, USA), following the manufacturer's protocol. RNA degradation and contamination were checked on 1% agarose gels. RNA purity and concentration were checked using a NanoPhotometer® spectrophotometer (Implen, West Lake Village, CA, USA) and the Qubit® RNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA), respectively. RNA integrity was assessed using the RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies, Palo Alto, CA, USA). Twenty-four samples were used for the transcriptome sequencing of J194 and J7 root and leaf tissues at 0, 6, and 12 hours of exposure to NaCl (J194L_0H, J194R_0H, J7L_0H, J7R_0H, J194L_6H, J194R_6H, J194L_12H, J194R_12H, J7L_6H, J7R_6H, J7L_12H, J7R_12H). Two biological replicates were taken for each of these twelve samples, giving twenty-four in total.
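The sample layout (two germplasms x two tissues x three time points x two biological replicates) can be enumerated as a quick check that it yields twelve groups and twenty-four libraries. The group names follow the labels listed above; the `_rep1`/`_rep2` suffix is a hypothetical naming convention used only for this sketch.

```python
from itertools import product

genotypes = ["J194", "J7"]   # salt-tolerant and salt-sensitive germplasms
tissues = ["L", "R"]         # leaf and root
hours = [0, 6, 12]           # hours of exposure to NaCl
replicates = [1, 2]          # two biological replicates per group

# Twelve sample groups, named as in the text (e.g. J194L_0H).
groups = [f"{g}{t}_{h}H" for g, t, h in product(genotypes, tissues, hours)]

# Twenty-four libraries; the replicate suffix is an assumed convention.
libraries = [f"{group}_rep{r}" for group in groups for r in replicates]

print(len(groups), len(libraries))
```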

Correlation
Since the selection of salt-tolerant and salt-sensitive germplasms was based on germination rate, the relationships between germination-rate-related traits and seedling-stage parameters, and between qRT-PCR and RNA-seq results, were studied using an R package [36].
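As an illustration of what such a correlation analysis computes, the sketch below implements the Pearson correlation coefficient. This is an assumption: the text does not state which coefficient was used, and the authors worked in R rather than Python. The paired values in the example are hypothetical fold changes, not results from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same genes from two platforms.
rnaseq_log2fc = [1.8, -0.9, 2.4, 0.3, -1.5]
qpcr_log2fc = [1.5, -1.1, 2.0, 0.6, -1.2]
print(round(pearson_r(rnaseq_log2fc, qpcr_log2fc), 3))
```

A coefficient close to 1 for paired qRT-PCR and RNA-seq fold changes is what a "significant positive correlation" between the two platforms would look like numerically.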

Transcriptome sequencing
A total of 1 μg RNA per sample was used as input material for the RNA sample preparations. Twenty-four RNA sequencing libraries were generated using the NEBNext Ultra™ RNA Library Prep Kit for Illumina (NEB, USA) following the manufacturer's recommendations, and index codes were added to attribute sequences to each sample [37]. The quality of the libraries was assessed using the Agilent Bioanalyzer 2100 system. Clustering of the index-coded samples was performed on a cBot Cluster Generation System using the TruSeq PE Cluster Kit v4-cBot-HS (Illumina) according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina platform, and paired-end reads were generated for transcriptome sequencing. All sequencing data were deposited in the NCBI database under the Sequence Read Archive (SRA) submission number SUB8495477.

Transcriptome analysis and annotation
Low-quality reads and reads containing adapter sequences were removed using in-house Perl scripts. High-quality clean data were used for all subsequent analyses. The clean data were assembled with Trinity using default parameters [38]. All clean reads were then mapped to the transcripts, and transcripts with less than 5X coverage were removed. Eight public databases were used for gene function annotation: KO (KEGG Orthology), KOG (Eukaryotic Orthologous Groups), Pfam (database of homologous protein families), NR (NCBI non-redundant protein database), GO (Gene Ontology), SwissProt (a manually annotated and reviewed protein sequence database), COG (Clusters of Orthologous Groups of proteins), and eggNOG. BLASTx searches used an E-value threshold of 1e-5 [39].
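The coverage filter can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study. Coverage is assumed to mean average depth (mapped bases divided by transcript length), the read length of 150 bp is a placeholder, and the transcript IDs and counts are hypothetical.

```python
def mean_coverage(n_reads, read_length, transcript_length):
    """Approximate mean depth: total mapped bases / transcript length."""
    if transcript_length <= 0:
        raise ValueError("transcript length must be positive")
    return n_reads * read_length / transcript_length

def filter_by_coverage(transcripts, read_length, min_cov=5.0):
    """Keep transcripts whose mean coverage is at least min_cov.

    transcripts maps a transcript ID to (mapped_read_count, length_bp).
    """
    return {
        tid: (n_reads, length)
        for tid, (n_reads, length) in transcripts.items()
        if mean_coverage(n_reads, read_length, length) >= min_cov
    }

# Hypothetical transcripts: ID -> (mapped reads, length in bp).
example = {
    "tx1": (100, 1500),  # 100 * 150 / 1500 = 10X, kept
    "tx2": (20, 2000),   # 20 * 150 / 2000 = 1.5X, removed
    "tx3": (50, 1500),   # 50 * 150 / 1500 = 5X, kept (not less than 5X)
}
kept = filter_by_coverage(example, read_length=150)
print(sorted(kept))
```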

Biological analysis of differentially expressed genes
The gene expression level of each sample was estimated with RSEM [40]. Clean reads were mapped back onto the assembled transcriptome, and a read count per gene was obtained from the mapping results. The DESeq R package (1.10.1) was used to identify Differentially Expressed Genes (DEGs) between the control and treatment groups [41]. P-values were adjusted using Benjamini and Hochberg's approach to control the false discovery rate. Genes with an adjusted P-value < 0.01 as determined by DESeq were designated as differentially expressed. Gene Ontology (GO) enrichment analysis of the DEGs was implemented with the GOseq R package, based on the Wallenius non-central hypergeometric distribution [42]. KOBAS software was used to test the statistical enrichment of DEGs in KEGG pathways.
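To make the multiple-testing step concrete, the sketch below implements the Benjamini-Hochberg step-up adjustment and applies the 0.01 cutoff. It is a standalone illustration, not the DESeq code path; the function name `benjamini_hochberg` and the P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for five genes (not data from this study).
pvals = [0.0001, 0.004, 0.03, 0.2, 0.0005]
padj = benjamini_hochberg(pvals)

# Genes passing the adjusted P-value < 0.01 cutoff, by position.
deg_indices = [i for i, q in enumerate(padj) if q < 0.01]
print(deg_indices)
```

With these example values, the first, second, and fifth genes pass the cutoff, while the third gene (raw P = 0.03) does not.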

Quantitative reverse transcription-PCR Analysis
The expression of eight randomly selected DEGs was analyzed using two-step quantitative reverse transcription-PCR (qRT-PCR) to validate the RNA-seq results. Two independent biological replicates and three technical replicates were performed. First, 1 µg of total RNA per sample was reverse-transcribed into first-strand cDNA using the GoScript™ Reverse Transcriptase kit (Promega Corporation, Madison, WI, USA), following the manufacturer's protocol. After 10× dilution, the cDNA was used as the template for qRT-PCR (Bio-Rad CFX96 Real-Time System with C1000 Touch Thermal Cycler, USA). The reaction mixture was prepared using the FastStart Universal SYBR Green Master (ROX) kit (Roche), following the manufacturer's protocol. The jute ELF (elongation factor 1 alpha) gene was selected as the endogenous control [43]. Primers for the DEGs and ELF are listed in the Supplementary File (Additional Table S1). Relative expression levels were determined using the comparative Ct method [44].
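The comparative Ct calculation can be illustrated with a short sketch. It assumes the common 2^-ΔΔCt form with roughly 100% amplification efficiency and averages technical replicates before taking differences. The function names and Ct values are hypothetical, with the reference gene standing in for ELF.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of Ct values."""
    return sum(values) / len(values)

def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene by the comparative Ct (2^-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_treated - delta_control
    return 2.0 ** (-ddct)

# Hypothetical Ct values, three technical replicates each (not study data).
target_treated = [22.1, 22.0, 21.9]
ref_treated = [18.0, 18.1, 17.9]
target_control = [24.0, 24.1, 23.9]
ref_control = [18.0, 17.9, 18.1]

fold = relative_expression(
    mean(target_treated), mean(ref_treated),
    mean(target_control), mean(ref_control),
)
print(f"fold change = {fold:.2f}")
```

In this example the target gene reaches threshold two cycles earlier under treatment after normalization, which corresponds to roughly a fourfold increase.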

Candidate genes validation
To validate the salt-tolerance candidate genes, seven of the genes used to validate the RNA-seq results above were randomly selected and assayed in root and leaf samples of the two germplasms (J194 and J7) exposed to NaCl stress for one and two weeks. The salt-tolerant and salt-sensitive germplasms (J194 and J7) were grown in separate pots filled with Pindrustup media and subjected to control or NaCl-treated conditions. The plants were grown under suitable conditions and watered; at the two-leaf stage (two weeks), 50 mL of the salt solution was applied at one-day intervals to the treated pots for another two weeks. At one and two weeks of treatment (Additional Figure S7c-d), leaves and roots, including the shoots, were harvested from each germplasm (J194 and J7) under control and salt-treated conditions, frozen immediately in liquid nitrogen, and stored at -80 °C for RNA extraction. Three biological replicates of each sample were used for RNA extraction.

Identification of PP2C genes in Corchorus capsularis and Arabidopsis thaliana
All members of the Arabidopsis PP2C gene family were initially searched using the keyword "protein phosphatase 2C" in the NCBI (http://www.ncbi.nlm.nih.gov) and TAIR (http://www.Arabidopsis.org) databases. The protein sequences obtained were pooled, and redundant sequences were removed using a custom Perl program. The remaining protein sequences were used as queries in multiple database searches against the genome and proteome files downloaded from TAIR. All redundant sequences were again deleted, and the remaining sequences were verified using the NCBI Conserved Domain Database (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) search program and the SMART database (http://smart.embl-heidelberg.de/) [45]. Proteins without PP2C domains, and those with obvious errors in gene length or shorter than 100 bp, were removed. The remaining Arabidopsis protein sequences with full PP2C domains were used as queries in a local BLAST search against the Corchorus capsularis reference genome; hits with the highest percentage identity and an E-value ≤ 1e-3 were selected for further analysis. The PP2C conserved domains of these hits were checked as described above, and all redundant sequences were removed accordingly.
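The hit-selection rule (E-value ≤ 1e-3, then highest percentage identity) can be sketched as a small parser for BLAST tabular output. This assumes the standard 12-column `-outfmt 6` layout, which the source does not specify. The function name `best_hits` and the query and subject identifiers are hypothetical.

```python
def best_hits(blast_lines, max_evalue=1e-3):
    """Pick, per query, the hit with the highest percent identity
    among hits whose E-value is at or below max_evalue.

    Expects BLAST tabular output (-outfmt 6) with columns: qseqid, sseqid,
    pident, length, mismatch, gapopen, qstart, qend, sstart, send,
    evalue, bitscore.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        pident, evalue = float(fields[2]), float(fields[10])
        if evalue > max_evalue:
            continue  # fails the E-value threshold
        if query not in best or pident > best[query][1]:
            best[query] = (subject, pident, evalue)
    return best

# Hypothetical BLAST rows (identifiers and scores are made up).
rows = [
    "query1\tCc_hitA\t78.5\t300\t60\t2\t1\t300\t5\t304\t1e-120\t410",
    "query1\tCc_hitB\t91.2\t120\t10\t0\t1\t120\t1\t120\t0.5\t30",
    "query1\tCc_hitC\t82.0\t290\t50\t1\t1\t290\t3\t292\t1e-130\t430",
    "query2\tCc_hitD\t65.0\t250\t85\t3\t1\t250\t1\t250\t2e-4\t55",
]
selected = best_hits(rows)
for query, (subject, pident, evalue) in sorted(selected.items()):
    print(query, subject, pident, evalue)
```

Here `Cc_hitB` has the highest identity for `query1` but is discarded because its E-value exceeds the threshold, so `Cc_hitC` is retained instead.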

Alignment and phylogenetic analysis of PP2C sequences
The amino acid sequences of the Arabidopsis and Corchorus capsularis PP2C proteins were aligned with MUSCLE [46] in MEGA7 using the default options [47]. The phylogenetic tree was constructed by the maximum likelihood method with 1000 bootstrap replicates; all other parameters were left at their defaults.

qRT-PCR analysis of PP2C (reference marker) gene expression in J194 and J7
The salt stress-responsive PP2C marker genes RAB18, RD29B, KIN1, and RD29A were retrieved from the Arabidopsis TAIR database (http://www.Arabidopsis.org). These genes were used as queries in a local BLAST search against the Corchorus capsularis reference genome, and the following genes were selected for expression analysis according to their identity and E-value: Cc.04G0022370 (RAB18), Cc.03G0029910 (RD29B), Cc.01G0008270 (KIN1), and Cc.03G0029910 (RD29A).
Primers for these genes are listed in Additional Table S2. qRT-PCR was carried out on the J194 and J7 root and leaf samples collected at 0, 6, and 12 hours of exposure to NaCl, as described above. Samples taken at one and two weeks of salt treatment in the pot experiment, as stated previously, were also analyzed by qRT-PCR.

Declarations
Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
The raw transcriptome data were deposited in the NCBI database under the Sequence Read Archive (SRA) submission number SUB8495477.