Reproduction is a complex trait, and the complexity of its associated processes indicates that predicting male fertility requires more than the handful of criteria currently in use, such as motility and morphology. Omics technologies offer great promise for improving the reproductive efficiency of males in a cost-effective and non-invasive manner, e.g. through predictive measures and the identification of subfertile phenotypes. We present a straightforward methodology for discovering the various populations of proteins that decorate the buffalo sperm surface and play crucial roles in sperm function (Supplementary Fig. 2). The in-depth functional proteome was deciphered through label-free shotgun proteomics (LC-MS/MS), which has not been explored earlier. The identification of 1342, 678 and 982 distinct proteoforms (FDR < 0.01) in the salt-extracted, PI-PLC-treated and capacitated samples, respectively, reflects not only the disparity in the interaction of these proteins with the buffalo sperm surface but also the varied influence of the extraction treatments on the removal of non-covalently attached, GPI-anchored and capacitation-associated proteins (Supplementary sheet-Results.xlsx). Our study provides a repertoire of sperm surface proteins that will be valuable for deciphering the maturational and functional aspects of sperm biology. The overall GO and pathway functional annotation analysis revealed a rich proteomic background covering a myriad of functions. As expected, there was an enrichment of proteins involved in the regulation of fertility, spermatogenesis and zona binding or penetration. Most of the identified proteins were relevant in the context of male fertility and provide a platform for further studies on sperm biology.
We have previously reported the removal of sperm surface proteins by these treatments [13]. In this study, however, we additionally included capacitation as a natural method of protein shedding and assessed five distinct sperm function parameters. Several of these parameters have reportedly been associated with male fertility and sperm health [41–44]. The effect of time and medium on function and protein extraction is more pronounced in spermatozoa because of their elevated susceptibility to environmental insults vis-à-vis the female gamete and somatic cells. This is ascribed to the high polyunsaturated fatty acid (PUFA) content of the plasma membrane, the exceptionally high surface area/volume ratio (> 50:1) and the near absence of repair, proofreading and protective enzymes [45–47]. Moreover, despite using N = 12 biological replicates, stringent search criteria and statistical rigour, our previous study was limited by few technical replicates and a short MS run of 60 min [13]. Therefore, to increase the identification depth and confidence, we again used N = 12 biological replicates (minimum 4 ejaculates each, pooled), added n = 3 technical replicates for each sample group, increased the length of the MS/MS run to 100 min and incorporated the proteins released at the time of capacitation. A wide range of diversity was observed among the more than 1600 proteins and isoforms predicted to be present on buffalo spermatozoa with respect to their molecular weight, pI, distribution of expression across chromosomes and functional roles (Fig. 3), as reported in other species [48–54]. Several proteins were unique to each of the treatment groups, indicating the disparate effects of the treatments in disrupting the non-covalent (2X-DPBS) or covalent (PI-PLC) interactions among the proteins of the sperm surface. Such remarkable proteome diversity is characteristic of any mammalian cell, and sperm is no exception [2, 33, 37].
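To illustrate how treatment-unique and shared identifications can be tallied, the following is a minimal Python sketch under stated assumptions, not the pipeline used in this study. The function name `partition_by_treatment` and the accession labels are hypothetical placeholders.

```python
def partition_by_treatment(ids_by_treatment):
    """Split protein IDs into those found in every treatment and those
    found in exactly one treatment.

    ids_by_treatment: dict mapping a treatment name to an iterable of IDs.
    Returns (shared, unique), where unique maps each treatment to its own IDs.
    """
    groups = {name: set(ids) for name, ids in ids_by_treatment.items()}
    # IDs present in all treatment groups
    shared = set.intersection(*groups.values())
    unique = {}
    for name, ids in groups.items():
        # Union of the IDs seen in every other treatment
        others = set().union(*(g for other, g in groups.items() if other != name))
        unique[name] = ids - others
    return shared, unique


# Hypothetical accessions, for illustration only
example = {
    "salt": ["P1", "P2", "P3", "P5"],
    "pi_plc": ["P2", "P3", "P4"],
    "capacitation": ["P3", "P5", "P6"],
}
shared, unique = partition_by_treatment(example)
```

IDs found in exactly two groups (here `P2` and `P5`) fall into neither category, which is why the union of all groups, rather than the sum of per-group counts, gives the total number of distinct identifications.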
Our computational strategy involved the buffalo NCBI RefSeq database, which is rich in data on gene isoforms. The translated database used as the search space thus included data on proteoforms as well. The interplay between various isoforms of the same protein has been implicated in the regulation of many crucial cellular processes. For instance, a wide spectrum of protamine proteoforms has been discovered in human sperm chromatin and is implicated in the production of infertile phenotypes. These include intact proteins as well as truncated and post-translationally modified proteoforms [56]. Thus, the buffalo sperm surface proteome carries information about spermatogenesis, sperm function and the paternal contribution to post-fertilization events in the FRT. The factors responsible for the high expression from chromosome 5 and the minimal contribution of chromosome 13 (Fig. 6) to the buffalo sperm-surface proteome are not yet known. Chromosome length appeared to dictate gene density in buffalo, since chromosome 13 is 30% shorter than the chromosome (Bubalus bubalis, reference genome UOA_WB_1), which may explain its lower contribution. Besides, we mapped peptides rather than proteins, given that the subunits of a single protein complex may be expressed from different chromosomes. For example, the two subunits of the human RBC metabolic enzyme glucose-6-phosphate dehydrogenase (G6PD) are encoded by two distinct structural genes on chromosome X and chromosome 6, which give rise to a fully functional enzyme either by cross-translation of two mRNAs or by transpeptidation [56]. Nevertheless, further whole-sperm proteomic studies would shed light on this aspect.
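The chromosome-length argument can be checked by normalizing the number of mapped peptides by chromosome length. The following is a minimal Python sketch, assuming peptides have already been assigned to chromosomes. The function name, chromosome labels, lengths and counts are hypothetical placeholders and are not values from UOA_WB_1 or from this study.

```python
from collections import Counter


def peptides_per_mb(peptide_chromosomes, chromosome_length_mb):
    """Return mapped-peptide density (peptides per Mb) for each chromosome.

    peptide_chromosomes: iterable with one chromosome label per mapped peptide.
    chromosome_length_mb: dict mapping chromosome label to its length in Mb.
    """
    counts = Counter(peptide_chromosomes)
    return {
        chrom: counts.get(chrom, 0) / length_mb
        for chrom, length_mb in chromosome_length_mb.items()
    }


# Placeholder inputs, for illustration only
lengths_mb = {"chrA": 100.0, "chrB": 70.0}
mapped = ["chrA"] * 50 + ["chrB"] * 7
density = peptides_per_mb(mapped, lengths_mb)
```

If the densities of two chromosomes still differ markedly after this normalization, as in the placeholder data, length alone does not account for the difference in contribution.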
Various studies in multiple mammalian species have indicated that sperm proteins reflect the quality of semen or spermatozoa, can indicate the outcome of fertilization and embryogenesis, and can even identify subfertile phenotypes [2, 33, 49–51]. Most of these studies have, however, compared seminal plasma, total or sub-populations of spermatozoa or their extracts, or have only used regression models to identify stable fertility biomarkers [56–59]. The shotgun proteomics (LC-MS/MS) approach used in this study allowed us to determine the effect of the extraction treatments and to attain an unparalleled depth of 1695 buffalo-specific proteoforms. This depth contrasts with previous proteomic studies on buffalo sperm that used less powerful methods, e.g. 2D gel electrophoresis coupled to MS, which provided valuable insights into protein patterns of fertility but addressed a limited number of proteins owing to technical limitations [7, 28, 33, 50]. The label-free quantification (LFQ) used in this study has been demonstrated to outperform the laborious gel-based techniques because of its higher accuracy, lower variation and experimental error, and lower sample requirements [60]. The results obtained in this study reveal minimal variation across the sample loadings and among the technical replicates used for the three extraction treatments (Fig. 6C). This is in concordance with other proteomic studies, wherein protein extraction and digestion have been demonstrated to be the major sources of technical variability in LC-MS proteomics [61].
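One common way to quantify variation among technical replicates is the percent coefficient of variation (CV) of the LFQ intensities of a protein. The sketch below is a minimal Python illustration under that assumption. It is not necessarily the metric underlying Fig. 6C, and the intensities shown are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Return the percent CV (sample SD / mean * 100) of replicate intensities."""
    values = list(intensities)
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean intensity is zero")
    return stdev(values) / average * 100.0


# Hypothetical LFQ intensities of one protein in n = 3 technical replicates
replicate_intensities = [90.0, 100.0, 110.0]
cv_percent = coefficient_of_variation(replicate_intensities)
```

For these placeholder values the sample standard deviation is 10 and the mean is 100, giving a CV of 10%. Lower values across most proteins would be consistent with minimal technical variation.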
One limitation of this study is that some intracellular proteins were identified along with the surface proteins (Supplementary sheet-Results.xlsx). This is not unexpected, as the integrity of the plasma membrane is likely compromised over time in a few sperm cells, leading to the release of intracellular proteins into the incubation medium. The sub-cellular compartments of mammalian spermatozoa are known to be highly dynamic architectures that are sensitive to extraction treatments [19]. Nevertheless, it is worth mentioning that many proteins hitherto thought to be intracellular have now been identified on the sperm surface, e.g. HSP60, PGK2 and AK1 [62, 63]. Furthermore, it has been reported that many core intracellular proteins, e.g. nucleosomal histones (H1, H2A, H2B, H3 and H4), can directly translocate across the cell membrane and act as carriers for the import of extracellular macromolecules into living mammalian cells [64].
We employed BLAST2GO 5.0 (OmicsBox) and the KEGG Orthology Based Annotation System (KOBAS 3.0) for functional and pathway annotation of the identified proteins, considering the lack of annotated bio-molecular data for buffalo [65, 66]. As mentioned earlier, buffalo is not listed in the commonly used gene/protein annotation software or in gene ontology and functional annotation tools. The GO cellular component terms indicated an enrichment of proteins localized to the plasma membrane, cell periphery and acrosomal vesicle or secreted into the extracellular space. The identified proteins were predicted (by BLAST2GO) to be primarily involved in the regulation of metabolism, the immune system (immunomodulation) and reproductive processes (e.g. gamete interaction), and a majority of them possessed one or another catalytic domain (Fig. 7). These findings indicate successful extraction of sperm-surface, plasma membrane proteins by the method used in this study. These observations agree with earlier studies that have also presented evidence for the expression in spermatozoa of genes crucial to sperm functional attributes, maturation, survival, fertilization and embryonic development (Fig. 7) [67–69]. Cross-species sequence-similarity mapping by KOBAS corroborated the enrichment of various metabolic-regulatory pathways and of those associated with immune-related processes and cellular homeostasis (Fig. 8). Overall, to the best of our knowledge, this is the first time that > 1500 non-covalently (electrostatically) linked, GPI-anchored and capacitation-related proteoforms (mostly un-annotated) from the buffalo sperm surface have been identified and functionally annotated by extrapolation of function from their respective annotated sequence orthologs (Figs. 9 and 10).
The results of this study again underpin the intricate relationship that entwines innate immunity, immune evasion and sexual reproduction [10, 13]. During the transit of spermatozoa through the MRT, several surface proteins are either removed or modified, and numerous novel antigens are either adsorbed onto or inserted into the sperm plasma membrane directly, or associate with existing binding proteins, e.g. clusterin, lipocalins and GPI-APs [70–73]. Such adhered proteins are thus present in a peripheral cellular environment, bound to the sperm surface likely by electrostatic/hydrophobic interactions. Several of these secretory proteins, including β-defensins (BDs) such as SPAG11, Bin1b and DEFB-126, are thus applied as a peripheral coat onto the surface of the mammalian spermatozoon during its passage through the MRT and are often reported to be overrepresented [73–77]. The abundance of immune-related proteins has also been associated with stabilization of the sperm membrane during attack by immune cells and with immunomodulation, particularly in the uterine lumen [59, 78, 79]. For instance, we identified 11 unique DEFBs (β-defensins) in our study, viz. DEFB-107A-like, 110, 112, 113, 114, 116, 119, 121, 123, 43-like and DEFB-129 (BuBD-129). Such a high abundance of host-defence peptides (HDPs) agrees with previous omics studies that have also demonstrated high expression of DEFB genes (in the epididymis), which are known to be implicated in evading elicited immune responses (protection) and in the regulation of sperm function [80, 81]. Many spermatozoal transcripts of β-defensins, such as DEFB-1, 7, 123, 124, 119 and DEFB-110, have been reported to be highly expressed in HF bulls and are thus implicated in regulating fertility (Fig. 9). A pertinent question that arises from our study on buffalo is “Why so many?”
The DEFBs (β-defensins) have been proposed to be crucial for the health of the MRT by providing a rapid, broad-spectrum response to diverse pathogens, and are also required for interacting with immune cells in the FRT, functioning as chemokines or transducing the information stored in their abundant O-linked glycans [21, 82]. It has been proposed that sperm competition, sexual conflict and sexual selection, in combination or individually, could drive the rapid adaptive evolution observed in reproduction-related genes [83, 84]. Asymmetric evolution after tandem gene duplications appears to be the case for BuBDs and has shaped the evolution of the β-defensin gene family in bovids, including buffalo. This phenomenon is quite common in mammalian evolution and contributes to varying gene numbers in different species [10, 79]. Alternatively, the high number of BD genes in buffalo, as observed in cattle, especially on the sperm surface, may be due to the domestication process, which has subjected the animals to higher population densities vis-à-vis wild bovids. This would have exposed them to a greater number of diverse microbes, because herd structure and the polygynous mating system promote rapid transmission of diseases [16–18, 85]. The seminal β-defensins also protect the sperm from LPS-mediated inflammation, which is known to reduce motility [86]. Interestingly, BuBD-129 was the only β-defensin common to all the extraction methods. This novel defensin has been demonstrated to decorate the entire buffalo sperm and was predicted to be heavily O-glycosylated [10, 13]. This explains the discrepancy between the observed band (~35–40 kDa) and the computed molecular weight (MW) of ~19 kDa (Fig. 6D). Similar shifts in the observed molecular weight due to O-glycosylation have been reported [87–89].
Preliminary data such as ours indicate that the novel β-defensin BuBD-129 is among the prominent buffalo sperm coat proteins and could be a potential fertility biomarker [73, 87, 89]. Future studies elucidating its exact physiological function would clarify its precise role in the modulation of fertility.