Genome‑wide DNA methylation analysis of sperm
To compare the genome-wide DNA methylation profiles of sperm samples of the weak group (DFI>30%) and normal group (DFI<15%), 6 weak group sperm samples and 7 normal group sperm samples were examined using WGBS (Table 1). After quality control and data preprocessing, all samples were found to have a bisulfite conversion rate greater than 96%.
Compared with the normal group sperm samples, the global DNA methylation level of the weak group sperm samples showed a downward trend (Figure 1A). We then performed differentially methylated region (DMR) analysis to determine epigenetic differences between the two groups. A total of 4939 DMRs (3083 hypermethylated and 1856 hypomethylated) were identified in the weak group sperm samples relative to the normal group sperm samples (Table S1), with 2072 of them (41.95%) located in promoter regions (within 3000 bp upstream or downstream of a transcription start site) (Figure 1B). The percentage of hypermethylated DMRs was higher than that of hypomethylated DMRs in all seven of the examined gene annotation groups (Figure 1C).
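The DMR tallies and the promoter definition above can be checked with a short sketch. The function names, the overlap rule, and the toy coordinates are illustrative assumptions; the annotation tool actually used is not stated here.

```python
# Counts reported in the text.
HYPER, HYPO, PROMOTER = 3083, 1856, 2072
TOTAL = HYPER + HYPO


def promoter_percentage(promoter: int, total: int) -> float:
    """Percentage of DMRs annotated to promoter regions."""
    return 100.0 * promoter / total


def is_promoter_dmr(dmr_start: int, dmr_end: int, tss: int, window: int = 3000) -> bool:
    """Return True if the DMR overlaps [tss - window, tss + window].

    Assumption: "promoter" means any overlap with the +/-3000 bp window
    around a transcription start site; coordinates are inclusive.
    """
    return dmr_start <= tss + window and dmr_end >= tss - window


print(TOTAL)
print(round(promoter_percentage(PROMOTER, TOTAL), 2))
# Hypothetical DMR at 1000-2000 bp tested against a hypothetical TSS at 4500 bp.
print(is_promoter_dmr(1000, 2000, tss=4500))
```

Whether partial overlap or full containment counts as a promoter hit is a design choice; the sketch uses overlap, which is the more permissive rule.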
The top 300 DMRs comprised 7921 CpGs and had a median length of 1282.5 bp, and most of them were located in promoter regions. Using the top 300 DMRs, we were able to separate the sperm samples into two groups corresponding to DFI>30% and DFI<15% (Figure 1D); these 300 DMRs may therefore be potential biomarkers for sperm quality.
To characterize the functional relevance of the 3083 hypermethylated DMRs, each was associated with its nearest gene, and gene ontology (GO) and pathway enrichment analyses were performed. The hypermethylated DMRs were significantly enriched mainly in neuron-related terms, such as axonogenesis, regulation of neuron projection development, synaptic membrane, axon guidance, neuron projection guidance and distal axon (Figure 1E). These findings suggest that an increase in sperm DFI level may affect embryonic nervous system development by causing epigenetic dysregulation of neuron-associated genes. In addition, the hypermethylated DMRs were significantly enriched in microtubule-related terms (Figure 1E). Microtubules are an important part of the 9+2 structure of the sperm tail and are essential for sperm motility. Since the sperm motility of the weak group was significantly lower than that of the normal group (Table 1), an increase in sperm DFI level may affect sperm tail structure and sperm motility by causing epigenetic dysregulation of microtubule-associated genes.
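GO enrichment of this kind is commonly scored with a one-sided hypergeometric test. The sketch below assumes that test; the enrichment tool and statistic actually used are not stated here, and all numbers in the example call are toy values.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: number of background genes
    K: background genes annotated to the GO term
    n: genes nearest to hypermethylated DMRs
    k: of those n genes, the number annotated to the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible overlaps contribute nothing to the sum.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy example: 40 of 1500 DMR-associated genes fall in a 200-gene term
# drawn from a hypothetical background of 20000 genes.
p_value = hypergeom_enrichment_p(k=40, n=1500, K=200, N=20000)
print(p_value)
```

In practice the p-values for all tested terms would also be corrected for multiple testing before any term is called significant.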
Six representative DMRs (CD14 cluster, TENM3 cluster, MB21D2, DAPL1 cluster, GLT1D1 cluster and ZNF516 cluster) identified by WGBS are shown in Figure 2.
Chromosome compartment analysis
Because the global DNA methylation level of the weak group sperm samples showed a downward trend, we speculated that the spatial conformation of chromosomes might differ between the two groups. We constructed the chromosome compartments of sperm from the two groups using the WGBS data and found that the compartments of five chromosomes (13, 4, 5, 21 and Y) in the weak group sperm changed compared with the normal group, and that the correlations of methylation within these five chromosomes were weakened (Figure 3), which suggests that the structure of these five chromosomes in the weak group sperm had become looser. The chromosomes of the weak group sperm may therefore be more vulnerable to ROS attack and more prone to breakage. Among these five chromosomes, the compartments and methylation correlations of chromosome Y changed the most between the two groups (Figure 3I, 3J), which may suggest that elevated DFI levels cause more damage to Y-bearing sperm than to X-bearing sperm.
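One common way to estimate compartments from methylation data is to take the first eigenvector of the correlation matrix of binned methylation values. The sketch below assumes that approach, which may differ from the procedure actually used here; the function name, bin layout, and toy data are hypothetical.

```python
import numpy as np


def compartment_eigenvector(binned_meth: np.ndarray) -> np.ndarray:
    """First eigenvector of the bin-by-bin methylation correlation matrix.

    binned_meth has shape (n_bins, n_samples): mean methylation of each
    genomic bin (rows) in each sperm sample of one group (columns).
    The sign of each entry assigns the bin to one of two compartments.
    """
    corr = np.corrcoef(binned_meth)    # rows are treated as variables
    _, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    return eigvecs[:, -1]              # eigenvector of the largest eigenvalue


# Toy data: 6 bins x 7 samples; bins 0-2 and bins 3-5 vary in opposite
# directions across samples, mimicking two compartments.
rng = np.random.default_rng(0)
signal = rng.normal(size=7)
block_a = 0.8 + 0.05 * signal + 0.005 * rng.normal(size=(3, 7))
block_b = 0.8 - 0.05 * signal + 0.005 * rng.normal(size=(3, 7))
toy = np.vstack([block_a, block_b])

vec = compartment_eigenvector(toy)
print(np.sign(vec))
```

The overall sign of an eigenvector is arbitrary, so which sign corresponds to the open or closed compartment has to be fixed with external information. Weaker within-chromosome correlations would show up as smaller off-diagonal entries in `corr`.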
Deep sequencing analysis of sncRNAs
We extracted total RNA from 13 weak group sperm samples (DFI>30%) and 17 normal group sperm samples (DFI<15%) (Table 1), and analyzed the expression of sncRNAs by deep sequencing. We found that rsRNAs, tsRNAs, yRNAs and miRNAs were abundant in sperm samples. On average, about 40.5% of the sncRNAs were annotated as rsRNAs, 19.3% as tsRNAs, 10.4% as yRNAs, and 7.1% as miRNAs (Table 2, Table S2). The length distribution of these sncRNAs was similar across samples in both groups (Figure 4). The length peaks of tsRNAs, rsRNAs, and miRNAs ranged from 17 to 40 nt, 17 to 40 nt and 20 to 23 nt, respectively. We analyzed the proportions of miRNAs, tsRNAs, rsRNAs, sn/snoRNAs, yRNAs, and piRNAs in each sample (Figure 5A) and found that the proportions of these sncRNAs were not significantly different between the two groups (Figure 5B).
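The per-class proportions and the reads-per-million (RPM) values used in this section follow from simple normalization of read counts. A minimal sketch, assuming raw counts per sncRNA class or per sequence are available; the function names and all counts are toy values, not data from this study.

```python
def class_percentages(class_counts: dict[str, int]) -> dict[str, float]:
    """Share of annotated reads falling in each sncRNA class, in percent."""
    total = sum(class_counts.values())
    return {name: 100.0 * count / total for name, count in class_counts.items()}


def reads_per_million(seq_counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts of individual sncRNAs to reads per million."""
    total = sum(seq_counts.values())
    return {name: 1e6 * count / total for name, count in seq_counts.items()}


# Toy counts for one hypothetical sample.
toy_classes = {"rsRNA": 500, "tsRNA": 300, "miRNA": 200}
toy_sequences = {"seqA": 30, "seqB": 10, "seqC": 60}

percentages = class_percentages(toy_classes)
rpm = reads_per_million(toy_sequences)
print(percentages)
print(rpm)
```

The denominator (all mapped reads, all annotated reads, or all sncRNA reads) changes the resulting values, so it should match the one used for the reported tables.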
A total of 632 miRNAs (average RPM > 10) were detected in sperm samples (Table S3), of which 27 were differentially expressed between the two groups (9 up-regulated, 18 down-regulated) (Figure 6A, 6B, 6C and Table S4). Figure 6D shows the difference between the two groups in the expression of the top 10 miRNAs ranked by average expression. Furthermore, principal component analysis (PCA), a powerful tool for exploratory data analysis and for generating predictive models, separated the weak group sperm samples from the normal group based on these 27 differentially expressed miRNAs (PC1=48.72%, PC2=19.14%) (Figure 10A). These results indicate that the 27 miRNAs have excellent prognostic value and may be potential biomarkers for assessing human sperm quality. Moreover, the target genes of 16 of the 27 differentially expressed miRNAs were predicted by TargetScan. An analysis of significantly enriched GO terms showed that the target genes of both the down-regulated and up-regulated miRNAs were involved in nervous system and cell development (Figure 7), indicating that these miRNA target genes might be important for the development of the nervous system and other systems in the early embryo.
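The PC1 and PC2 percentages quoted for each PCA are the shares of total variance explained by the first two components. A minimal sketch of that computation via singular value decomposition, assuming a samples-by-features expression matrix; the function name and the simulated data are hypothetical, and any scaling or log transformation applied in the actual analysis is not stated here.

```python
import numpy as np


def pca_explained_variance(expr: np.ndarray, n_components: int = 2):
    """PCA scores and explained-variance ratios via SVD.

    expr has shape (n_samples, n_features), for example expression values
    of the differentially expressed sncRNAs in each sperm sample.
    """
    centered = expr - expr.mean(axis=0)  # center each feature
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    ratio = variance / variance.sum()    # share of variance per component
    scores = u * s                       # sample coordinates on the components
    return scores[:, :n_components], ratio[:n_components]


# Simulated data: 13 "weak" and 17 "normal" samples with 27 features,
# where the weak group is shifted upward in every feature.
rng = np.random.default_rng(0)
weak = rng.normal(loc=3.0, size=(13, 27))
normal = rng.normal(loc=0.0, size=(17, 27))
expr = np.vstack([weak, normal])

scores, ratio = pca_explained_variance(expr)
print(np.round(100 * ratio, 2))  # percent variance for PC1 and PC2
```

Because the features here were already selected for differing between groups, separation on PC1 is expected; an independent sample set would be needed to judge predictive value.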
tsRNAs were highly expressed in human sperm, and a total of 3612 tsRNAs (average RPM > 10) were detected in sperm samples (Table S5), of which 151 were differentially expressed between the two groups (76 up-regulated, 75 down-regulated) (Figure 8A, 8B, 8C and Table S6). Figure 8D shows the difference between the two groups in the expression of the top 10 tsRNAs ranked by average expression. PCA showed that these 151 differentially expressed tsRNAs could also classify the samples into two groups (PC1=28.79%, PC2=23.57%), indicating that these tsRNAs have comparable predictive power and may be another type of useful biomarker for the clinical evaluation of sperm quality (Figure 10B).
We found that rsRNAs are the most highly expressed sncRNAs in human sperm. A total of 10707 rsRNAs (average RPM > 100) were detected in sperm samples (Table S7), of which 70 were differentially expressed between the two groups (42 up-regulated, 28 down-regulated) (Figure 9A, 9B, 9C and Table S8). Figure 9D shows the difference between the two groups in the expression of the top 10 rsRNAs ranked by average expression. PCA showed that these 70 differentially expressed rsRNAs can be used to separate the weak and normal group sperm samples (PC1=47.06%, PC2=18.13%) (Figure 10C).
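The detection thresholds (average RPM > 10 for miRNAs and tsRNAs, average RPM > 100 for rsRNAs) and the up/down direction of a change can be illustrated as follows. This is a sketch only: the function names, the pseudocount, and the toy values are assumptions, and the statistical test and cut-offs used to call differential expression are not stated here.

```python
from math import log2
from statistics import mean


def filter_by_mean_rpm(rpm_table: dict[str, list[float]], threshold: float) -> dict[str, list[float]]:
    """Keep sncRNAs whose mean RPM across all samples exceeds the threshold."""
    return {name: values for name, values in rpm_table.items() if mean(values) > threshold}


def log2_fold_change(weak: list[float], normal: list[float], pseudocount: float = 1.0) -> float:
    """log2 ratio of group means; positive means higher in the weak group.

    The pseudocount (an assumption) avoids division by zero for sncRNAs
    that are absent from one group.
    """
    return log2((mean(weak) + pseudocount) / (mean(normal) + pseudocount))


# Toy RPM values across four hypothetical samples.
toy_rpm = {"sncRNA_a": [120.0, 80.0, 30.0, 10.0], "sncRNA_b": [2.0, 4.0, 6.0, 8.0]}
detected = filter_by_mean_rpm(toy_rpm, threshold=10)
print(sorted(detected))

# Toy comparison: the weak group mean is four times the normal group mean.
lfc = log2_fold_change([40.0, 40.0], [10.0, 10.0], pseudocount=0.0)
print(lfc)
```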
Finally, we identified nine sncRNAs as candidate sperm quality biomarkers (Table 3). PCA showed that these nine sncRNAs can better separate the weak and normal group sperm samples (PC1=34.02%, PC2=26.28%) (Figure 10D). In the future, we will need to verify in a larger sample whether these nine sncRNAs can serve as sperm quality biomarkers. True sncRNA biomarkers must not only distinguish high-DFI sperm from low-DFI sperm but, more importantly, accurately predict pregnancy outcome.