Subclinical mastitis caused by different pathogens including S. uberis has no visible symptom making detection harder, and causing significant economic losses. The milk somatic cells indirectly capture the host response to an insult and constitute an effective sample for investing the genetic and epigenetic changes associated with S. uberis subclinical mastitis. Moreover, compared with mammary gland tissue or blood, milk somatic cells are easier to collect without causing extra harm or discomfort to milking cows, thus respecting animal welfare and also easier and cheaper to apply in a larger population. However, milk somatic cells are composed of multiple cell types, each of which may contribute differently to the genetic or epigenetic alterations between S. uberis-positive and healthy groups. Moreover, the small sample size (three cows per group), although adequate for the types of sequencing technologies used , may cause relatively lower statistical power. Therefore, use of single cell sequencing technology and a higher sample size are necessary to validate our findings and the discriminant biomarkers detected in this study. As far as the authors know, this is the first study to profile the whole genome wide DNA methylation patterns of milk somatic cells from cows with naturally occurring S. uberis subclinical mastitis.
The genome-wide DNA methylation pattern of milk somatic cells detected in this study is consistent with other mammals, in that DNA methylation mainly exists in the context of CpG  and with methylation levels greater than 70%. This study investigated the DNA methylation patterns in all three contexts (CpG, CHG and CHH) and found significant differences between CpG sites and CHG/CHH sites. CHG and CHH sites did not show a downward trend of methylation levels at the promoter region typical for CpG sites, but displayed higher methylation levels in exons, especially first exons. CHG and CHH sites also showed higher methylation levels in CGI, where CpG sites are usually unmethylated. This opposite trend in methylation level distribution of CHG and CHH sites from CpG sites suggests that the mechanisms of CHG and CHH methylation involvement in mastitis may be different from CpG methylation, deserving further exploration. A clue to the possible mechanisms of CHG and CHH methylation could not be gained from this study because the integrative analysis of methylome and transcriptome data did not find significant correlations between the expression levels of genes and CHG and CHH methylation patterns at the promoter, first exon or first intron of genes, which may be due to their extremely low methylation levels. As expected, a sharp decrease in methylation level trend was identified for CpG sites in the promoter regions. Furthermore, and at a genome wide scale, the methylation level of the promoter region showed significant negative correlation with gene expression level, thus supporting a repression role of DNA methylation at the promoter region on transcriptional activities [40–42]. Additionally, this study further found that, the higher the expression level of a gene, the lower its methylation level is, at the TSS, and the more intense the decrease in methylation level is at the promoter region (Fig. 5D). This further proved the importance of a low level of methylation at the promoter region for normal gene expression. In addition, CpG sites also showed low methylation level (~ 20%) at the first exon with increasing trend in first intron reaching ~ 80% level in inner exons and introns. The methylation level of the first exon and first intron was negatively correlated with gene expression in this study, a scenario supported by previous investigations [43, 44]. DNA methylation in first exon has been tightly linked to silenced transcriptional activities, indicating that its low methylation level is an important feature for normal gene expression mechanisms . Furthermore, the DNA methylation pattern of alternatively spliced exons has been found to affect inclusion level, and thereby contribute to proteome diversity; and loss of DNA methylation at exons was found to alter splicing patterns . Taken together, our observation and these reports suggest important roles of DNA methylation not only at the promoter region, but also at the first exon and first intron of genes.
The methylation status of milk somatic cells was then compared between S. uberis-positive and healthy control group. Firstly, the CpG methylation landscape among chromosomes indicates globally a lower methylation level in S. uberis-positive group. Specifically, more than half of the identified DMCs had lower methylation levels in S. uberis-positive group when compared to the control group. In support of our data, a lower DNA methylation level was also reported in blood neutrophils of cows with E. coli-induced mastitis  and mammary epithelial cells challenged with bacterial lipopolysaccharide , as well as in cow ileum tissue positive for mycobacterium avium spp paratuberculosis infection , suggesting that the lower whole genome methylation level could be an epigenetic signature of cow’s response to mastitis and other diseases. As expected, more DMCs were in the CpG context, further supporting the important regulatory roles of CpG methylation during disease processes. The functions of DNA methylation are spatially specific, in that DNA methylation in different genetic features have different effects, such as transcriptional repression at the promoter and exon splicing alteration in exons . It is worthy of note that about half the identified DMCs were located in repeat elements, represented by SINE, LINE and ERV, which is reasonable because repeats account for about half the bovine genome. DNA methylation performs very important roles in the transcriptional silencing of repeats to restrict their genotoxic potentials and thereby contribute to genome stability . It has been suggested that the changes in DNA methylation status in diseases may enable the activation of repeat elements, especially cancers . Thus, the enrichment of DMCs in repeat elements in this study suggests a possible regulatory mechanism of DNA methylation by affecting the activities of repeat elements in response to S. uberis infection. Considering the distribution of DMCs in CGI context, CGI shores harbored more CpG-DMCs than CGI shelves and CGIs. Moreover, it was interesting to observe a downward trend of DNA methylation level for CpG sites in CGI shores. Observation of more hypomethylated DMC sites in this study is opposite to results in humans whereby CGI shores hypermethylation has been widely identified in cancer and tumor development  and also reported to regulate the expression of key genes in human breast cancer [52, 53]. The potential roles of DNA methylation alterations during mastitis are not yet clear needing further exploration to better understand the specific roles of hyper-/hypo- methylation during bovine S. uberis mastitis.
In this study, the DNA methylation changes during S. uberis subclinical mastitis was not only identified at single cytosine sites but also at cytosine regions to avoid the possible technical noise resulting from measuring the methylation levels of single cytosines. Firstly, 316 DMRs were found by scanning the whole genome with fixed-width windows, leading to the identification of important regions with greater methylation differences. However, this method did not consider the interaction between adjacent methylation sites, prompting us to explore additional methods. The DNA methylation changes in response to environmental stressors, especially disease pathogens, are mediated by changes in implicated enzyme activities (DNMT1, DNMT3A/B and TET proteins) . The change in enzyme activities is locally coordinated, leading to similar methylation status of adjacent sites, which contributes to form methylation haplotypes . Therefore, we used the method of MHB (methylation haplotypes block) to investigate the co-methylation status of adjacent CpG sites and to identify biologically relevant linked regions of DNA methylation sites known as MHBs. The average length of MHBs identified in this study (~ 35 bp) was shorter than ~ 58 bp reported for bovine sperm  and ~ 59 bp reported for human primary organs (lung, fat, spleen, liver, embryonic stem cells, etc.) , which may be due to the small sample size of this study and differences in methylation sequencing and analytical methods. A total of 451 dMHBs showed significant differences in the co-methylation status of adjacent sites between S. uberis-positive and control groups, revealing possibly more direct epigenetic changes in response to S. uberis infection. For example, a dMHB on Chr5 (chr5:115877805–115877881) overlapped with SINE2-1 and harbored three CpG sites that were highly correlated with each other (r2 > 0.7). The three CpG sites were all unmethylated in the healthy control group but presented methylation levels of about 40% in S. uberis-positive group. Another dMHB on Chr11 (chr11:18837854–18837864) was very short, being only 10 bp length and harbored 4 fully correlated DMCs (r2 = 1). Meanwhile, dMHB chr5:115877805–115877881 was unmethylated in S. uberis-positive group, but showed mean methylation level of ~ 42.85% in the control group. It is worthy of note that dMHB chr11:18837854–18837864 located in the intron of CRIM1 has been reported to be involved in the regulation of mammary gland morphology  and milk protein concentration . Besides, a CRIM1 variant has been associated with neutropenia during pediatric acute lymphoblastic leukemia , suggesting its possible effects on neutrophil activity, which is important for mastitis pathogenesis. This further highlights the potential of dMHB chr11:18837854–18837864 as a candidate biomarker of S. uberis subclinical mastitis. As shown by these two examples, MHBs represents the co-methylation status of a number of adjacent CpG sites, which could help to reduce the possible technical bias that is the main limitation of quantifying single DMC sites. Although DMR profiling is an improvement over single DMC quantification, it also has the limitation of choosing the length of windows and sliding step during identification. These limitations are overcome with the technique of MHB which relies on the relationship between adjacent DMCs and the bias of specifying sequence length is removed. In addition, MHBs are much shorter than DMR making it easier to profile in a larger population. Therefore, MHBs are relatively more reliable to be used as candidate biomarkers than DMC or DMR.
To further investigate the regulatory roles and mechanisms of DNA methylation changes in response to S. uberis mastitis, the DNA methylation and gene expression data were integrated. Based on MethGET results, a total of 1623 MetGDE genes were identified with significant changes (MethGET p-value < 0.001) in their gene expression levels and/or DNA methylation levels at promoter regions, gene bodies, introns or exons. The DNA methylation alterations of the MetGDE genes possibly regulated the expression changes of corresponding genes, prompting further analyses to understand their potential roles. For instance, the MetGDE gene TTC9 had 19% higher promoter methylation level and about 3.5 times lower expression level in S. uberis-positive group, suggesting its expression was more likely repressed by its increased promoter methylation. In addition, changes in the DNA methylation levels of about half of the MetGDE genes were in the same direction as changes in their gene expression, which is against the general notion that promoter methylation represses gene expression, suggesting that promoter methylation may regulate gene expression by other mechanisms. Interestingly, a bunch of MetGDE genes are well-known immune-related genes, such as 10 genes of the interleukin family (IL6, IL9, IL12B, IL17A, IL17F, IL17RB, IL1RN, IL27RA, IL36B, IL36G) and 24 genes of the solute carrier family (such as SLC2A6, SLC24A1, SLC25A21, etc.). Interleukin genes play important roles in inflammation and the immune system, and some of them, such as IL6 has been found to regulate the immune response to bovine mastitis and also identified as a candidate biomarker of subclinical mastitis [59, 60]. It is worth mentioning that MetGDE gene SLC25A21 harbored 46 DMCs in its gene body and 6 DMCs at its promoter region, and 80% of them were hypo-methylated. SLC25A21 has been reported as differentially expressed in mammary gland tissue following infection by E. coli and S. aureus . The hypomethylation and up-regulated expression of SLC25A21 in S. uberis-positive group in this study highlights that its altered expression during S. uberis mastitis may be regulated by DNA methylation changes.
Functional enrichment revealed enrichment of MetGDE genes in GO terms and KEGG pathways with immune and disease related functions, further supporting the potential regulatory roles of DNA methylation of these genes and their involvement in the immune response during S. uberis mastitis. In particular, a cluster of annotations related to cytokine process and activities were enriched with potentially up-regulated activities (positive z-score), including Cytokine-cytokine receptor interaction, Chemokine activity, Chemokine-mediated signaling pathways, Cellular response to interleukin-1 and Chemokine signaling pathway. Chemokines, such as interleukins, interferons and chemokines, play key roles in the regulation of intensity and duration of inflammatory, and immune responses to mastitis infection by regulating the activities of immune-related cells [14, 15]. The significant enrichment of these cytokine-related pathways strongly suggest that DNA methylation changes are potentially involved in the regulation of cytokine activities by mediating the expression of related genes, and the host immune response to S. uberis infection. Interestingly, a cluster of highly related MetGDE genes with chemokine activities, including CXCL2, CXCL8, CXCL12, CCL3, CCL4, CCL5 and CCL20, were enriched in multiple pathways mentioned above. CXCL8, also known as IL-8, is a pro-inflammatory cytokines which functions to guide neutrophils to the site of infection . CXCL2 and CXCL12, members of the CXC subfamily are expressed at the site of inflammation and serve diverse cellular roles during infection. CCL3, CCL4, CCL5 and CCL20 (C-C chemokine family) play crucial roles in recruiting leukocytes such as T-cells and macrophages to the site of inflammation . These genes are highly related to each other and involved in similar immune processes, and their altered activities have been associated with the host response to mastitis. For example, CXCL2 and CXCL8 have been identified as the most informative genes and as candidate biomarkers of bovine mastitis, suggesting their possible effects on cow's ability to resist mastitis [61, 64–66]. Besides, these genes CCL3, CCL4, CCL5 and CCL20 have also been reported as differentially expressed in response to mastitis by other investigators [67–69]. The CCL5 gene has been found to be up-regulated in bovine mammary epithelial cells stimulated by E. coli, but down-regulated in mammary glands with S. aureus-induced mastitis [70, 71]. Moreover, CCL5 has also been reported to be closely associated with the pathogenesis of chronic mammary inflammation and mastitis [72, 73]. These reports highlight the important association of these genes with mastitis; however, the underlying regulatory mechanisms are not well understood. The DNA methylation changes in these genes during S. uberis infection observed in this study and during mastitis caused by other pathogens [16, 19, 20, 74], suggest DNA methylation changes as one of the underlying regulatory mechanisms. An in vitro challenge of immortalized bovine mammary epithelial cells with peptidoglycan and lipoteichoic acid induced global hypomethylation by regulating DNMT activity and causing inflammation . In agreement, the DNA methylation of S. uberis-positive group was globally lower than of the control group in this study. Furthermore, the expressions of CXCL2, CXCL8, CXCL12, CCL3, CCL4, CCL5 and CCL20 were up-regulated in S. uberis-positive group, while hypomethylation was detected in the promoter and gene body regions of most of them (except CXCL2). This indicates that DNA methylation changes may be one possible mechanism that modulated the differential expression of these key genes leading to the activation of the activities of chemokines and heightened host immune response to S. uberis infection. Another cluster of significantly enriched GO term and KEGG pathways related to diseases and immune response pathways was represented by NF-κB signaling pathway and Antigen processing and presentation pathway. The NF-κB pathway play roles as the upstream signal that controls the transcription of the inflammatory factors mentioned above[75, 76]. Besides, the activity of the Antigen processing and presentation pathway tended to be increased (positive z-score), suggesting that the enriched MetGDE genes (e.g. CD40LG, LOC524810, BOLA-DQA2, LOC100300716 and MGC126945) may be involved in the regulation of the adaptive immune response by affecting antigen processing. For example, the interaction between CD40LG and CD40 is very important for immune responses dependent on T cells, and they have been found to participate in bovine inflammatory responses . LOC524810 (IgM) and LOC100300716 (immunoglobulin heavy variable 4-38-2) are immunoglobulins performing critical functions of recognizing and binding antigens, such as S. uberis. Genetic variations in BOLA-DQA2, a Bovine Leukocyte Antigen (BOLA) class II gene, have been associated with resistance to dairy cow mastitis [78, 79]. MGC126945, an uncharacterized protein, was found as highly variable in bovine respiratory disease .
DNA methylation mediates genomic adaption to environmental influences without changing the underlying DNA sequence and consequently contributing to phenotypic expression, and throwing more light on explaining the “black box” between the phenotype and the genotype . Including epigenetic biomarkers to current livestock breeding programs may improve the prediction accuracy of genetic breeding values and thereby increased genetic gain [81–83]. Furthermore, it could contribute to complement genomic information and provide a better understanding of the factors that shape livestock phenotypes and directional application in breed improvement and management practices . A large body of evidence from human studies reveals DNA methylation changes as promising biomarkers for disease diagnosis or risk assessment [85, 86]. Therefore, this study utilized the discriminant analysis of DIABLO that allows integration of DNA methylation and gene expression data , to identify candidate biomarkers (DE genes, DMCs and dMHBs) of S. uberis mastitis. The candidate biomarkers delineated clearly the two groups, suggesting their ability to discriminate S. uberis-positive cows from the control group. The candidate biomarkers consisted of genetic biomarkers (6 DE genes), and epigenetic biomarkers (15 DMCs and 5 dMHBs). The epigenetic biomarkers were highly correlated with genetic biomarkers (cor > |0.80|), suggesting that DNA methylation changes (epigenetic biomarkers) are more likely involved in the expression changes of the genetic biomarkers. Looking at the epigenetic candidates, all 5 dMHBs showed higher methylation levels in S. uberis-positive group. The most important dMHBs (chr5:32984279:32984330 and chr24:7353387:7353442) and four DMC candidates were overlapped with repeat elements, suggesting that epigenetic candidate biomarkers may be involved in the regulation of the host response to S. uberis infection by mediating transposition activities. For instance, dMHB chr5:32984279:32984330 overlapped with repeat element ALTR2C_BT, belonging to ERV1 family. Endogenous retroviruses (ERV) are ubiquitous in mammalian genomes, and some have been found to play roles as enhancers for immune-related genes in both human and mice . In addition, dMHB chr24:7353387:7353442 overlapped with transposon L1MA9, a member of L1 family, as well as CD226 gene which functions by mediating cytotoxicity and is associated with immunologic diseases [88, 89]. The function of repeat elements in bovine diseases, including mastitis is ill-defined. Our data suggest involvement in epigenetic processes, which requires further exploration. Among the candidate DE genes, SLC40A1 has the greatest loading weight and its correlation with epigenetic biomarkers was extremely high (cor > |0.99|). SLC40A1 was significantly down-regulated in S. uberis-positive group and harbored three hypermethylated DMCs in its gene body, suggesting that gene body methylation may be involved in the regulation of its expression. On the contrary, the SLC40A1 promoter was identified as hypomethylated with associated up-regulated mRNA expression in blood neutrophils, and consequently considered a candidate biomarker for improving resistance to bovine mastitis induced by E. coli, . These contrasting effects of SLC40A1 DNA methylation on its gene expression may be caused by the differences in tissue or the pathogenic bacteria between the two studies. Differential immune responses and related genetic or microRNA regulation has been reported during infections by Staphylococcus aureus and E. coli [65, 70, 90]. This suggests that the DNA methylation status and expression of SLC40A1 may change in different directions during mastitis caused by S. uberis compared with E. coli.