Overall transcriptome and sequencing data
To better understand P. yessoensis sex determination and differentiation mechanisms, we conducted a comparative transcriptomic analysis. A total of 18 libraries were constructed and named as follows: PyUmf-1 to 3, PyUmm-1 to 3, PyMf-1 to 3, PyMm-1 to 3, PyOf-1 to 3, and PySm-1 to 3. Gonad development stage was identified via histological methods, and hermaphrodites were excluded. Unfortunately, one of the four individual gonad development stages of PySm-1 was not accurately defined, which affect the trends of overall data, therefore, we excluded the data of PySm-1. As a result, a mean of 50,354,447 filtered clean reads with Q20 above 97.16 % was obtained from each library and 1.55-2.75 % of low quality reads mapped with rRNA were removed.
Identification of DEGs
Principal components analysis revealed strong clustering associated with sex, excluding sample PySm-1 (Fig. 1a). PC1 accounted for 44 % of the differential expression and was correlated with sex. However, different development stages did not cluster obviously, as PC2 accounted for only 24.4 % of the differential expression. In the sample relationship analysis, a heatmap plot revealed that sex identification of samples was exact except for the PySm-1 group (Fig. 1b). Therefore, we did not include the RNA-Seq data of sample PySm-1 in later analyses.
In the differential expression analysis, 3,412 genes in PyUmf-vs-PyUmm, 2,909 genes in PyOf-vs-PySm, and 2,778 genes in PyMf-vs-PyMm were found to differentiate between females and males. Fewer differential genes were found between PyUmf and PyMf, PyMf and PyOf, PyUmf and PyOf, PyUmm and PySm, PyUmm and PyMm, and between PyMm and PySm, with 1,504, 920, 381, 249, 42, and 14 differential genes, respectively (fold change ≥2, P < 0.05). DEGs were very abundant between females and males (Fig. 2). However, there were fewer DEGs between different gonad development stages.
Subsequently, a total of 23292 known genes and 2237 novel genes were identified via transcript reconstruction of the P. yessoensis genome. Of the 25529 protein coding genes analyzed, 1,171 showed significant differences between female and male gonads, and there were more female-specific genes (754) than male-specific genes (417) (Additional file 1: Figure S1).
Gene co-expression network interactions differ between the sexes
WGCNA revealed that the genes of female and male scallop gonads can be clustered into 19 modules (Fig. 3), which were further classified and clustered by similarity = 0.75 and minModuleSize = 50, with module sizes ranging from 79 to 4058. Of these modules, the turquoise module with size 1541 and the green module with size 1451 were significantly positively correlated with male gonads and were negatively correlated with female gonads. In contrast, the coral1 module with size 1371 and the black module with size 860 were significantly positively correlated with female gonads and were negatively correlated with male gonads.
Functional enrichment analysis of the genes
To better understand the function of these differentially expressed genes in sex differentiation or determination, GO term enrichment (Fig. 4) and KEGG pathway enrichment were conducted. Seventy-eight GO terms (belonging to biological processes, cellular components, and molecular functions) with p < 0.05 were significantly enriched in the coral1 module, 78 GO terms were enriched in the black module, 160 terms were significantly enriched in the turquoise module, and 237 were enriched in the green module. Most were enriched as cellular processes, metabolic processes, or single-organism progresses, which can all be classified as biological processes. Lots were enriched as cells, cell parts, or organelles, which are classified under cellular components. Lots of terms were also enriched as binding or catalytic activities, which can be categorized as molecular functions. At least 49 genes in five modules were identified as being involved in sex differentiation or determination (Additional file 2: Table S1), and most of these were enriched in the GO function terms mentioned above. In total, out of all of the transcripts, 5030 genes had KEGG assignations. These mainly belonged to metabolic processes, environmental information processing functions, and genetic information processing pathways. More specifically, ribosome biogenesis in eukaryotes, RNA transport, metabolism of xenobiotics by cytochrome P450, degradation of valine, leucine and isoleucine, aminoacyl-tRNA biosynthesis, and beta-Alanine metabolism were significantly enriched in females (Fig. 5 a, b), and purine metabolism, carbon metabolism, biosynthesis of amino acids, oxidative phosphorylation, the TCA cycle, and the fanconi anemia pathway were significantly enriched in males (Fig. 5 c, d). All of these GO term enrichments and KEGG pathway enrichments were closely related to sex differentiation or determination. Some of them belonged to cellular process pathways, which are relevant to gamete development.
Quantitative real-time PCR
Fifteen sex-biased genes were randomly selected for qRT-PCR to validate differential expressions identified by RNA-Seq. Comparing the transcriptome data with the qRT-PCR results, we found that the patterns of transcript abundance detected for genes using qRT-PCR were consistent with the RNA-Seq results (Fig. 6). The expression pattern of ncbi_110450487(Dmrt 1) (Fig. 6d) was different from the expression patterns of the male-biased genes shown in Figures 6 a, b, c, e, and f. Its expression was also remarkably different from the expressions of the female-biased genes shown in Figures 6 g, h, and i. Therefore, Pydmrt 1 was found to be the only one gene which determined the male sex phenotype during early gonadal differentiation in darkgreen module.
Identification of sex-biased hub genes and network construction
In the turquoise module, the genes in the core positions were ncbi_110441989 (cAMP and cAMP-inhibited cGMP 3',5'-cyclic phosphodiesterase 10A, PDE), ncbi_110442454 (testis-specific serine/threonine-protein kinase 3 –like, tssk-3), ncbi_110466991 (WD repeat-containing protein on Y chromosome-like, WD rcp), and ncbi_110444113 (leucine-rich repeat-containing protein, LRR) (Additional file 3: Figure S2). In the green module, ncbi_110443603 (choline transporter-like protein 1, CTL1), ncbi_110465229 (actin 5C), and ncbi_110441893 (basic helix-loop-helix and HMG box domain- containing protein 1, bHLH) were in the core positions (Additional file 4: Figure S3). Other genes associated with these genes all had the same expression patterns, and were much more highly expressed in male gonads than in female gonads.
In the coral1 module, ncbi_110465748 (polypeptide N-acetylgalactosaminyltransferase 4, GALNT4), ncbi_110454866 (protein ovo-like isoform X4), ncbi_110450517 (Cytochrome P450 1A4-like, CYP1A4-like), ncbi_110450286(forkhead box protein A2-A-like, fox A2-like), and XLOC_015112 (transcriptional regulatory protein TBS1-like, TBS1-like) were in the core positions (Additional file 5: Figure S4). In the black module, ncbi_110456621 (glycoprotein 3-alpha-L- fucosyltransferase A, Fuc-T), ncbi_110447244 (collagen alpha-2(I) chain), ncbi_110440232 (uncharacterized protein LOC105336037), and ncbi_110448784(Vitellogenin, Vg) were in the core positions (Additional file 6: Figure S5). Other genes associated with these genes all had the same expression patterns, and were much more highly expressed in female gonads than in male gonads.
The 49 selected genes were used to construct networks based on their relationship coefficients of expression mode in female and male groups (Fig. 7). In the female group, ncbi_110450517(CYP1A4-like),ncbi_110465748(GALKNT4)and ncbi_110454866(protein ovo-like)are the hub genes, other genes, such as ncbi_110447244 (collagen alpha-2(I) chain), XLOC_015112 (TBS1-like) and ncbi_110456621 (Fuc-T) are positively related to these hub genes. In the male group, ncbi_11450487 (dmrt 1) was a negative hub gene, while ncbi_110441989 (PDE), ncbi_110444113 (LRR), ncbi_110441976 (sox 30), ncbi_110466991 (WD rcp), ncbi_110443603 (CTL 1) and ncbi_110442454 (tssk) were all positive hub genes, ncbi_110463608 (RFX4), ncbi_110463135 (IF-2-like), ncbi_110442133 (MARCH3), ncbi_110442246 (stabilizer of axonemal microtubules, MTs), ncbi_110445444(Ras guanine nucleotide) and ncbi_110445173(Biorientation of chromosomes in cell division protein 1) were closely negatively related to ncbi_11450487 (dmrt 1), but positively related to other hub genes.
Hypothesized sex differentiation pathway in Patinopecten yessoensis
By analyzing the overall gene expression profiles of gonads, at least 49 genes in five modules involved in sex differentiation or determination were identified (Additional file 1:Table S1). This study found that dmrt 1 was in the darkgreen module, and that its expression was much higher in PyUmm than in other male stages, while very low or none in all female gonads. This expression trend was consistent both in q RT-PCR and de novo analysis (Fig. 6), which means that high expression of Pydmrt 1 determines the male sex phenotype during early gonadal differentiation. Further studies should focus on validating the function of dmrt 1. Other selected target genes were identified as transcription factors (TFs), and they were enriched in males or females by comparing all the datasets, especially the correlation coefficients, expression levels, and gonad developmental stage. Therefore, we propose that dmrt 1 may have a leading function in the sex differentiation pathway of P. yessoensis, and that significantly high expression of dmrt 1 directly activates sox 30 (transcription factor SOX-30), LRR (leucine-rich repeat-containing protein), MTs, WD rcp (WD repeat-containing protein on Y chromosome), tssk-3 (testis-specific serine/threonine-protein kinase 3), and cAMP and cAMP-inhibited cGMP 3',5'-cyclic phosphodiesterase (PDE), which were all male-specific genes in the turquoise module. TBX4 (T-box transcription factor 4), CTL 1(choline transporter-like protein 1), Protein B4, and HSFP-like (heat shock factor protein-like) were in the green module and were highly expressed genes. On the other hand, CYP1 A4-like (Cytochrome P450 1A4-like), GALNT 4 (polypeptide N-acetylgalactosaminyltransferase 4), fox A2-like (forkhead box protein A2- like), GATA-1, and ovo-like were in the coral1 module and were female-specific genes. Vg (Vitellogenin) and Fuc-T (glycoprotein 3-alpha-L- fucosyltransferase A) were high expression genes in the black module. Therefore, we propose the following sex differentiation pathway in P. yessoensis (Fig. 8).