3.1. Differentially expressed genes (DEGs) in the hepatopancreas of S. paramamosain
A total of nine samples were used in this analysis. The Q30 of raw data for each sample was distributed from 93.00 to 94.09%, the effective data was distributed from 6.97 to 8.06 G, and the average GC content was 45.90% (Additional file: Table S3). Unigene 54537 pieces were spliced with a total length of 60890086 bp and an average length of 1116 bp (Additional file: Table S2). As the genome sequencing of S. paramamosain has not yet been elucidated, after reads filtering, Trinity was used to perform de novo assembly with clean reads. Tgicl was used on cluster transcripts to remove abundance and obtain Unigenes.
The reads were compared to Unigene and the alignment was 90.63-93.10% (Additional file: Table S3). Stochasticity evaluation of sequencing data showed that the distribution of reads in the various parts of the gene was relatively uniform, indicating that the randomness of the break was good. The RPKM values of the individual genes determined under each 5% increment were compared to the expression levels of the final corresponding genes. The results differed by less than 15%, and therefore, sequencing data met quantitative requirements. The distribution of FPKM mean values for Unigenes ranged from 11.50 to 13.33 (Additional file: Fig. S1). Inter-sample correlation tests confirmed that the study was reliable and the sample selection was reasonable. Principal component analysis showed that each group of samples was distributed in different regions, and the same group of samples was concentrated spatially. Cluster analysis also showed that similar samples grouped together (Fig. 3).
Finally, high-quality transcripts were obtained and used as reference sequences. For database annotations of Unigenes, 16049 (29.43%) genes were annotated into the NR library, 11876 (21.78%) genes were annotated into the Swissprot library, 7881 (14.45%) genes were annotated into the KEGG library, 10586 (19.41%) were annotated into the KOG library, 13351 (24.48%) genes were annotated into the eggNOG library, 11204 (20.54%) genes were annotated into the GO library, and 284 (0.52%) genes were annotated into the Pfam library (Additional file: Table S4). For functional annotation results, there were 54287 SSRs predicted, including 25,440 Unigenes containing SSRs, 13,056 Unigenes containing more than one SSR and 11,842 composite SSRs (Additional file: Table S5). A total of 29,221 CDS sequences were predicted, of which 16111 were predicted by the database comparison method and 13110 were predicted by ESTScan. A total of 1672 Unigenes were annotated to the transcription factor database and distributed across 63 families.
To compare the difference between the experimental group and control group in terms of the change in global proteomes in the hepatopancreas, volcano plots were created of P-value (-log10 P-value) versus the log2fold change for each gene analyzed. Accordingly, relative to the control group, the LL and HL group exhibited a change of a certain proportion of proteins (Fig. 2). A total of 5052 differentially expressed Unigenes were obtained between the LL group and the control, including 3104 up-regulated Unigenes and 1948 down-regulated Unigenes, while 7403 differentially expressed Unigenes were obtained between the HL group and the control, including 5262 up-regulated Unigenes and 2141 down-regulated Unigenes (p value<0.05 & |log2FC|>1) (Fig. 1A). Interestingly, 2480 differentially expressed Unigenes were intersected (Fig. 1B).
It is well known that carbon metabolism plays a key role in metabolic regulation. An important pathway in carbon metabolism is oxidative phosphorylation (ko00190), in which all significantly different genes are down-regulated. In the LL group, the three most highly down-regulated genes were NADH-ubiquinone oxidoreductase chain 3 (ID: P18930; 0.21-fold), NADH-ubiquinone oxidoreductase chain 2 (ID: P34848; 0.24-fold), and V-type proton ATPase subunit F (ID: Q1HQK8; 0.31-fold) (Table 1). In the HL group, the three most highly down-regulated genes were NADH-ubiquinone oxidoreductase chain 2 (ID: P34848; 0.10-fold), cytochrome c oxidase subunit 1 (ID: P00399; 0.17-fold), and NADH-ubiquinone oxidoreductase chain 4 (ID: Q34048; 0.17-fold) (Table 1). Two important pathways in energy metabolism are glycolysis / gluconeogenesis (ko00010) and the citrate cycle (TCA cycle, ko00020). In this process, only pyruvate dehydrogenase E1 component (ID: P49432) was significantly up-regulated. The fold change was 3.26 in the LL group and 2.64 in the HL group. In the LL group, the three most highly down-regulated genes were glyceraldehyde-3-phosphate dehydrogenase (ID: P56649; 0.25-fold), hexokinase type 2 (ID: Q9NFT7; 0.28-fold) and enolase (ID: P56252; 0.34-fold). In the HL group, the three most highly down-regulated genes were triosephosphate isomerase B (ID: Q90XG0 0.33-fold), aconitate hydratase (ID: Q99KI0; 0.37-fold), and glyceraldehyde-3-phosphate dehydrogenase (ID: P56649; 0.38-fold) (Table 1). Glycine, serine and threonine metabolism (ko00260) and beta-Alanine metabolism (ko00410) are important pathways in amino acid metabolism. 2-amino-3-ketobutyrate coenzyme A ligase (ID: O75600; LL: 2.13-fold, HL: 2.49-fold) and cystathionine gamma-lyase (ID: Q8VCN5; LL: 4.08-fold, HL: 16.10-fold) were up-regulated in both pathways. Betaine--homocysteine S-methyltransferase 1 (ID: Q5XGM3) was down-regulated in the LL group (0.53-fold) and up-regulated in the HL group (7.34-fold). In the LL group, the three most highly down-regulated genes were cytosolic non-specific dipeptidase (ID: Q96KP4; 0.13-fold), peroxisomal sarcosine oxidase (ID: Q29RU9; 0.16-fold), and aldehyde dehydrogenase (ID: P81178; 0.25-fold). In the HL group, the three most highly down-regulated genes were serine--pyruvate aminotransferase (ID: P41689; 0.19-fold), cytosolic non-specific dipeptidase (ID: Q96KP4; 0.21-fold) and D-3-phosphoglycerate dehydrogenase (ID: A5GFY8; 0.30-fold) (Table 1). Two important pathways in lipid metabolism are fatty acid elongation (ko00062) and fatty acid degradation (ko00071). Lysosomal thioesterase PPT2 (ID: O70489; LL: 3.47-fold, HL: 3.12-fold) and lysosomal thioesterase PPT2-B (ID: Q6GNY7; LL: 1.97-fold, HL: 3.00-fold) were up-regulated in both the LL and HL groups. In the LL group, fatty aldehyde dehydrogenase (ID: P47740) was the most down-regulated gene (0.31-fold), followed by alcohol dehydrogenase class-3 (ID: P79896, 0.38-fold) and hydroxyacyl-coenzyme A dehydrogenase (ID: Q16836; 0.44-fold) (Table 1). In the HL group, the three most down-regulated genes were hydroxyacyl-coenzyme A dehydrogenase (ID: Q16836; 0.35-fold), alcohol dehydrogenase class-3 (ID: P79896; 0.40-fold) and enoyl-CoA delta isomerase 2 (ID: Q9WUR2; 0.41-fold) (Table 1).
3.2. Functional annotation
Gene Ontology (GO) terms could be divided into three ontologies: cellular components, molecular function, and biological processes. In the LL group, 49 GO terms were assigned to the up-regulated group and 54 to the down-regulated group (Fig. 4A). In the HL group, 48 GO terms were assigned to the up-regulated group and 52 to the down-regulated group (Fig. 4B). A total of 23 processes were identified in the biological process category, comprising three biological processes (cellular process, metabolic process and biological regulation) as the most strongly affected in the hepatopancreas of S. paramamosain under different light intensities (Fig. 4). In the cellular component category, cells, cell parts, and membranes were most involved (Fig. 4). In the molecular function category, two items comprising binding and catalytic activities were most involved (Fig. 4).
3.3. Metabolic relatedDEG pathway analysis
KEGG pathway classification (Fig. 5) was performed according to the KEGG database website. All DEGs were graded into five categories according to their biological function, including cellular processes (LL, 150; HL, 175), environmental information processing (LL, 119; HL, 121), genetic information processing (LL, 198; HL, 309), metabolism (LL, 426; HL, 519) and organismal systems (LL, 227; HL, 247) (Fig. 5).
The pathways related to metabolism were subdivided into twelve subsets (level 2): xenobiotic biodegradation and metabolism (LL, 23; HL, 25), nucleotide metabolism (LL, 18; HL, 26), metabolism of terpenoids and polyketides (LL, 5; HL, 7), metabolism of other amino acids (LL, 24; HL, 26), metabolism of cofactors and vitamins (LL, 37; HL, 34), lipid metabolism (LL, 63; HL, 66), glycan biosynthesis and metabolism (LL, 27; HL, 36), global and overview maps (LL, 42; HL, 57), energy metabolism (LL, 55; HL, 99), carbohydrate metabolism (LL, 72; HL, 75), biosynthesis of secondary metabolism (LL, 4; HL, 3) and amino acid metabolism (LL, 56; HL, 65) (Fig. 5A&B and Additional file 1: Table S6).
Functional enrichment was also performed on DEGs according to the above KEGG pathway classification. The top 20 pathways showed that there were 15 and 12 pathways directly related to metabolism in the LL and HL groups, respectively. The most common metabolic pathways were biosynthesis of amino acids (ko01230), degradation of aromatic compounds (ko01220), carbon metabolism (ko01200), methane metabolism (ko00680), oxidative phosphorylation (ko00190) and glycolysis/Gluconeogenesis (ko00010) (Fig. 5C&D). The pathways related to metabolism in the LL group were pentose and glucuronate interconversions (ko00040), drug metabolism-cytochrome P450 (ko00982), starch and sucrose metabolism (ko00500), metabolism of xenobiotics by cytochrome P450 (ko00980), porphyrin and chlorophyll metabolism (ko00860), retinol metabolism (ko00830), glycine, serine and threonine metabolism (ko00260), steroid hormone biosynthesis (ko00140) and ascorbate and aldarate metabolism (ko00053) (Fig. 5C). The pathways related to metabolism in the HL group were caprolactam degradation (ko00930), carbon fixation in photosynthetic organisms (ko00710), aminobenzoate degradation (ko00627), geraniol degradation (ko00281), fatty acid elongation (ko00062) and the TCA cycle (ko00020) (Fig. 5D).
Seven important metabolic pathways were selected for further analysis: Glycine, serine and threonine metabolism (ko00260), beta-Alanine metabolism (ko00410), fatty acid elongation (ko00062), fatty acid degradation (ko00071), oxidative phosphorylation (ko00190), glycolysis / gluconeogenesis (ko00010) and the TCA cycle (ko00020) (Table 1). We focused on these pathways and the differential genes they contain.
3.4. Validity of DEGs in transcriptomic data
To validate the transcriptome results, qRT-PCR was used to check the transcript levels in six identified DEGs (PPT2, ODPB, KBL, ATP5H, G3P, and SPYA). As indicated by qRT-PCR, in the six genes selected in a random manner, the majority of the genes exhibited a similar transcription level expression to the transcriptome results in different light intensities, with the exception of G3P in the HL group and SPYA in the LL group (Fig. 6). The qRT-PCR results showed a high level of consistency with the transcriptome data obtained from transcriptome, indicating that the transcriptome results were reliable.