Genome-wide analysis of the Saccharina japonica sulfotransferase genes and their transcriptional profiles during whole developmental periods


 Background As a unique sulfated polysaccharide, fucoidan is the major component of cell wall in brown seaweeds. Its biochemical properties are determined by the positions and quantity of sulfate groups. The sulfation process is catalyzed by sulfotransferases (STs), which transfer the sulfuryl groups to carbohydrate backbones and are crucial for fucoidan biosynthesis. Nevertheless, the structures and functions of STs in brown seaweeds are not completely understood.Results There are a total of 27 ST genes identified from our genome and transcriptome analysis of Saccharina japonica and they were located in the 12 scaffolds and seven contigs. The S. japonica ST genes have many introns and alternative splicing sites, and four tandem duplicated gene clusters were identified among this genes family. Generally, the ST genes could be classified into five groups (Group I~V) based on phylogenetic analysis, and the ST proteins, which were encoded by genes within the same group, contained similar conserved motifs. In group I sequences, we found two highly conserved regions (region I: TxPKSGTxW; region IV: RKGxxGDWKxxFT), and Lys 81 and Arg 287 are critical for catalysis and substrate binding. Members of the S. japonica ST gene family show various expression patterns in different tissues and developmental stages. Transcripttional profiles indicate that more than half of ST genes transcriptional levels in kelp basal blades are higher than in distal blades, and decrease with the kelp development stages, while only ST 5 , 9 , 12 , 18 and 25 are increased. The co-down-regulated pathways are related to oxidative phosphorylation, fatty acid biosynthesis and metabolic pathways, while the up-regulated pathways are involved with ribosome, nitrogen and sulfur metabolism.Conclusion Various characteristics of the STs allow the feasibilities of S. japonica to adapt to the costal environments, meet the needs of S. japonica growth and support their giant blades. This also reflects the complexity of fucoidan biosynthesis in different marine environments, tissues, and developmental stages.


Background
Saccharina japonica is an important commercial brown seaweed in Asia. It is rich in crude fibers and carbohydrates and is widely used as a raw material for the extraction of alginate and fucoidan. Moreover, S. japonica contains many bioactive substances that are valuable for cosmetics, foods and health [1]. Among all bioactive metabolites, fucoidan is considered highly valuable in the field of medicine.
Fucoidan, a sulfated polysaccharide from non-mammalian sources, is one of the most widely studied sulfated polysaccharides. It mainly exists in echinoderm and cell walls of brown algae [2]. Fucoidan was first discovered by Kylin in brown algae Laminaria digitata in 1913 [3]. The main component of the sulfated fucoidan was L-fucose-4-sulfate; galactose, mannose, xylose, glucose, arabinose, glucuronic acid and metal ions exist in small amounts [4,5]. It was believed that the content and structure of fucoidans in algae vary in different algae species, tissues, age, inhabitance and seasons [6,7]. For example, the fucoidan content in brown algae growing in intertidal and subtidal zone was 20% and 1-2%, respectively [8]. The complexity in fucoidan content and composition makes the structure-activity studies difficult.
Fucoidan functions as anticoagulant and antithrombotic substances, and exert immunomodulation, anti-inflammation and anti-tumor functions [9][10][11][12]. It is also effective in relieving diabetic nephropathy and denine-induced chronic kidney disease [13,14]. The structural parameters of fucoidan, such as the type of monosaccharide and fucose chain and the molecular weight of polysaccharide, all contribute to its bioactivity, especially the number and position of sulfate groups on the macromolecular skeleton [15][16][17]. For instance, the 2, 3-disulfated sugar residue is a common structure for anticoagulant activity [18,19], whereas, the existence of 2-O-sulfation at the C-2 position reduces the 4 anticoagulant activity of fucoidan [20]. Thus, sulfation plays an important role in the function of fucoidan. It has been reported that sulfotransferase (ST) can transfer the sulfuryl groups from the universal donor 3'-phosphoadenosine 5'-phosphosulfate (PAPS) to carbohydrate backbones [21,22]. It determines the position and quantity of sulfate groups in fucoidan.
The fucoidan biosynthesis pathway was not clear until the release of genome sequences of brown algae Ectocarpus siliculosus in 2010 [23]. Based on E. siliculosus genome sequencing and analogized with glycosaminoglycan (GAG) biosynthesis, Michel et al.
Previously, many genes involved in fucoidan biosynthesis were studied in S. japonica, Cladosiphon okamuranus and Nemacystus decipiens [24][25][26]. For example, Chi et al. (2017) worked on the MPI, MPG and PMM in S. japonica [27], and Nishitsuji et al. (2019) reported that FK and GFPP fused in N. decipiens genome [26]. Their researches enlighten our further genome-wide analysis on sulfotransferases (STs) in brown algae. Considering few investigation of this key enzyme in S. japonica, it therefore needs to explore the biosynthetic mechanism of sulfated fucoidan, especially sulfation modification of fucan 5 catalyzed by ST.
As described above, the structure and content of fucoidan varied with the kelp growth. Therefore, the transcriptional levels of ST genes might be fluctuated at different development stages and tissues. Considering ST gene family contains large amount of putative members, e.g. 21 in E. siliculosus and 9 in C. okamuranus [23,25], it is thus necessary to globally analyze the distinct features of these STs in brown algae.
Transcriptome sequencing is an effective tool to decipher the transcriptional levels of the total STs and the entire metabolic network. This could reflect how the specific STs function with the kelp growth at the mRNA level.

Identification of fucoidan biosynthetic genes
A total of 103 genes related to fucoidan synthesis were annotated based on the genome and transcriptome databases of S. japonica (Table S1). Generally, the ORFs of 81 genes were complete, and the rest of 22 genes were incomplete. The transcriptional levels of MPI3 (GENE_013986), PMM1 (GENE_007314), GM46D1 (GENE_026041) and ST12 (GENE_011842) were higher in each catalytic step, and were believed to be essential for fucoidan synthesis during the kelp growth and development. Amino acid sequence analysis showed that most proteins were non-secretory, and only five proteins (ST2, ST3, ST13, ST19, and ST25) have transmembrane helices, and were predicted to have signal peptide or signal anchor. In addition, five proteins without the transmembrane domain (ST1, ST4, ST6, ST7, and ST14) contained signal peptides. Only ST2 and ST16, were predicted to target the chloroplast, whereas ST3, ST7, ST10, ST17, ST19 and ST22, were predicted to be located in the mitochondria (Table S2).

Sequence analysis of the ST genes
The 27 S. japonica ST proteins could be classified into 5 groups (I -V) (Fig. 1A). A total of 20 conserved motifs were identified ( Fig. 1B and Table S3). Group I, II and V contained Sulfotransfer_1 domain (PF00685), group IV contained Sulfotransfer-2 domain (PF03567), and group III contained Gal-O-sulfotr domain (PF06990), which was only present in algae [28]. The ST sequences in the same group contain similar conserved motifs (Fig. 1A, B).
Gene structures analysis showed that most ST genes (26, %) had more than three introns.
Only one gene had less than three introns (ST17, 2 introns). The longest intron identified in the ST genes was nearly 15 kb (Fig. 1C).
We analyzed the types and numbers of all alternative splicing sites in S. japonicaST genes in different tissues and developmental stages. A total of 368 sites were identified in all samples. The most abundant alternative splicing site type was the ES type (127). Some types were centrally detected in several specific genes, for example, p5_splice (ST1), p3_splice (ST3 and ST22), ALS (ST11 and ST22) and AFS (ST10 and ST20). In addition, ST genes expressed in the basal blade contained more alternative splicing sites than those expressed in the distal blade. Details of these sites are listed in Table S4.
The PCR products of the cDNA sequences of three selected STs appeared as a single band, except for GENE_014314, as revealed by 1.5% agarose gel electrophoresis (Fig. S1). The results of these four selected STs in plasmid sequencing data of recombinant vector and RNA-Seq analysis were consistent. The coding sequences of the four STs are provided in Table S5.

Scaffold location and gene duplication of the STs
The 27 ST genes loci distributed randomly on 12 scaffolds and 7 contigs in S. japonica

Sequence alignment and phylogenetic analysis
As shown in Fig. 3, the secondary structure of S. japonica ST protein contained abundant alpha-helix (on average 56.38%) , followed by beta-bridge (on average 27.44%). Although there were some differences in amino acid sequences, the sequences in the same group displayed similar secondary structures. In group I, we found two highly conserved motifs (regions I (TxPKSGTxW) and IV (RKGxxGDWKxxFT)) and two conserved active sites (Lys 59 and Arg 276 ) (Fig. 4).
A phylogenetic tree was constructed based on the ST amino acid sequences: 27 S. japonica, 21 E. siliculosus, six C. okamuranus and nine P. tricornutum (Fig. 5). Five ST clades in our phylogenetic tree were determined based on the classification result of 15 E. siliculosus ST sequences [21]. Fifty-four of the 63 ST proteins were grouped into four subfamilies and the E. siliculosus STs previously clustered into two different clades (clade A and clade B) fell into one group, which contained 15 STs. In addition, seven, 13 and 19 STs clustered into clades C, D and E, respectively, and the remaining nine STs were unclassified.  Table 3. The STs exhibited several major expressed patterns: profile0 (ST8, ST20, ST21 and ST24), profile2 (ST6 and ST26), profile3 (ST1, ST10

Transcript profiles of STs in different tissues and developmental stages
and ST27), profile6 (ST15 and ST23), profile25 (ST14), profile28 (ST25) and profile29 (ST5, ST12 and ST18). Profile 0 and 29 are the two representative profiles, the former contains genes with a down-regulated expression pattern from January to June while the latter had 9 an up-regulated trend. The most enriched pathways in profile 0 included oxidative phosphorylation, photosynthesis -antenna proteins, photosynthesis, carbon fixation and metabolic pathway. Genes involved in ribosome, nitrogen metabolism, sulfur metabolism and inositol phosphate metabolism were enriched in profile 29 (Table S6).
We analyzed the transcriptional levels of the STs in distal blade, 1/3, 2/3 and basal blade of S. japonica collected in April (Table 4). According to our data, the STs exhibited five major expression patterns: profile0 (ST17, ST18 and ST25), profile1 (ST8), profile4 (ST12 and ST19), profile9 (ST2, ST11, ST14, ST22 and ST24) and profile17 (ST23).The transcriptional levels of most STs were decreased, as observed for profiles 0, 1, 4 and 9, and genes related to basal metabolism and photosynthesis were enriched in these profiles, (Table S7). S. japonica contains more ST genes than other sequenced brown algae. The number of annotated ST sequences in S. japonica genome is approximately 3-to 8-fold higher than that in E. siliculosus, N. decipiens, and C. okamuranus (Table 1) for catalysis and cosubstrate binding, respectively [30]. Despite the change of basic groups, the conservativeness of key sites indicated the similarity of the catalytic mechanism between S. japonica group I ST proteins and ST from plants and bacteria. In the other four groups, we did not find these highly conserved motifs. Their active sites need to be further explored with methods of site-directed mutation etc.

Discussion
While distal blades contain higher fucoidan content [8], most STs are more highly expressed in the basal than in the distal blades. The basal blades of S. japonica have a strong ability to divide and produce more new cells than other tissues [31]. Considering fucoidan is a component of cell wall [2], the higher expression pattern of ST genes at basal blades may synthesize more fucoidan to meet the needs of new cells at the basal blades. From January to June, seawater temperature in the northern hemisphere gradually increases. High temperature leads to slower growth, development and metabolism of S. japonica. Some STs (eg. ST8, 20 , 21 , 24 ), which enriched in the same profile as those related to oxidative phosphorylation and metabolic pathways, therefore showed a downregulated trend. In addition, in mid-April, kelp enters the breeding season. Honya et al. (1999) found that S. japonica synthesized fucoidans with a higher sulfated content during its reproduction period, which played an important role in the formation and setting of spores [32,33]. We speculated that those STs which showed an up-regulated trend (eg. ST5, 12,18,25) can meet the reproductive needs of kelp. At the same time, protein degradation in kelp accelerated [34]. Correspondingly, nitrogen and sulfur metabolism were strengthened, and genes related to nitrogen and sulfur metabolism were also  Table 5.

RNA extraction and cDNA synthesis
Total RNA was isolated from the samples using a FastPure Plant Total RNA Isolation Kit (Vazyme, China). The extracted RNA was quantified using a Nanodrop 2000 Spectrophotometer (Thermo Scientific, USA). RNA quality was assessed by 1% agarose gel electrophoresis. First-strand cDNA was synthesized using HiScript II Q Select RT SuperMix (Vazyme, China) and stored at −20 °C for subsequent gene cloning and quantitative realtime PCR (qRT-PCR) analyse. All manipulations were operated following the manufacturers' instructions.

Identification of sulfotransferase family members in S. japonica genome
Based on our previously sequenced S. japonica genome (NCBI: MEHQ00000000) and recently released S. japonica transcriptome data (NCBI: PRJNA512328), we identified 73 genes automatically annotated as sulfotransferase ( ST) genes that catalyze the last step of fucoidan biosynthesis. These 73 genes were submitted to SMART (http://smart.emblheidelberg.de/) [35,36] and Pfam (http://pfam.xfam.org/search) [37] to confirm the presence of the conserved sulfotransferase domain. The predicted genes without the sulfotransferase domain and monthly average FPKM < 0.5 were excluded.
Structure of the ST genes was analyzed by Gene Structure Display Server (GSDS v2.0; http://gsds.cbi.pku.edu.cn/) [41]. The online software MEME (server v5.0.5, http://memesuite.org/) was used to identify conserved motifs in the ST proteins with the following parameters: any number of repetitions, maximum of 20 motifs and an optimum motif width of 6 -200 amino acid residues [42]. The chromosomal positions of the ST genes were acquired by aligning the full-length ST nucleic acid sequences to the S. japonica genome. We used MCScanX to search for duplicate genes in the ST family [43]. TBtools was used to display the chromosomal positions of STs and their relative physical distances [44].

Identification of alternative splicing events
Tophat 2.1.1 was used to analyze alternative splicing events in the 27 ST genes [45].
Alternative splicing sites supported by less than five reads were filtered out, and the remaining were mapped to known alternative splicing sites, allowing for 1 bp error. The known alternative splicing sites were identified and the unmapped new alternative splicing sites were classified again. Refer to Fig. S2 for details.

PCR amplification and sequencing of the ST genes
We randomly selected four genes (GENE_011842, GENE_026617, GENE_013439 and GENE_014314) for PCR amplification. Primers used to amplify the full-length cDNA sequences of these four genes are listed in Table 6

Sequence alignment and phylogenetic analysis
All the 27 S. japonica ST proteins were aligned by Clustal W (https://www.genome.jp/toolsbin/clustalw) with the default parameters [46]. To analyze the evolutionary relationships among the 27 STs in S. japonica, a neighbor-joining (NJ) phylogenetic tree was constructed based on their full-length amino acid sequences with MEGA 7.0.26 using the p-distance method with 1000 bootstrap replications, Gamma 3, partial deletion, and 50% site coverage as the cutoff value [47].
The amino acid sequences of STs derived from E. siliculosus, C. okamuranus, Phaeodactylum tricornutum and S. japonica were subjected to phylogenetic analysis (Table   S8). The maximum likelihood (ML) phylogenetic tree was constructed by MEGA 7.0.26 using the full-length amino acid sequences of 63 ST proteins with 1000 bootstrap replications, the WAG+F+G model, Gamma 3, partial deletion, and 50% site coverage as the cutoff value [47].

Transcript profiling of the ST genes in different tissues and developmental stages
Differentially expressed genes (DEGs) across samples were identified by the edgeR package (http://www.r-project.org/ ) [48]. Genes with a fold change ≥ 2 and a false discovery rate (FDR) < 0.05 were considered as significant DEGs.
The transcriptional profiles of the S. japonica ST genes in different tissues and developmental stages were determined. Transcript data of the ST genes in each sample were obtained, normalized, and clustered by the Short Time-series Expression Miner software (STEM) [49]. Significant clustered profiles had p-value ≤0.05. The DEGs in all profile were subjected to Gene Ontology (GO) and KEGG pathway enrichment analysis, and enriched GO terms or pathways with Q value ≤0.05 were considered to be significant. The heatmap of ST gene expression was drawn by TBtools [44].

Availability of data and material
All of the datasets supporting the results of this article are included within the article and its Additional files.

Competing interests
The authors declare that they have no competing interests.

Funding
This work was supported by the National Natural Science Foundation of China (no.  The multiple sequence alignment and secondary structures of 27 ST S. japonica amino acids (partial).
28 Figure 4 29 Two highly conserved motifs and sites in group I. They were respectively marked with black lines and triangles.    Table S4.xlsx   Table S1.xlsx