Identication of regulatory networks and hub genes controlling mammary gland development and lactation in dairy goats during the late lactation, dry period, and late gestation stages

From the late lactation to late gestation stages, the mammary gland tissue of goats undergoes a process from involution to remodeling and then to high differentiation of mammary gland tissue. From the perspective of lactation, this is a continuous development process of the goat mammary gland from the termination of lactation to the restoration of lactation. We performed transcriptome sequencing on goat mammary gland tissues at three mammary gland developmental stages to screen for differentially expressed genes that affect mammary gland development and the physiological process of lactation and mapped their expression proles in three stages. The objective of this study is to reveal the expression characteristics of these genes and their potential function during mammary gland development and lactation.


Abstract
Background From the late lactation to late gestation stages, the mammary gland tissue of goats undergoes a process from involution to remodeling and then to high differentiation of mammary gland tissue. From the perspective of lactation, this is a continuous development process of the goat mammary gland from the termination of lactation to the restoration of lactation. We performed transcriptome sequencing on goat mammary gland tissues at three mammary gland developmental stages to screen for differentially expressed genes that affect mammary gland development and the physiological process of lactation and mapped their expression pro les in three stages. The objective of this study is to reveal the expression characteristics of these genes and their potential function during mammary gland development and lactation.

Results
We identi ed 1,381 differentially expressed genes in the mammary gland during three stages and found that the expression level of genes encoding casein, such as alpha-s1-casein (CSN1S1), alpha-s2-casein (CSN1S2), beta-casein (CSN2), and kappa-casein (CSN3), and alpha-lactalbumin (LALBA) were higher in mammary gland tissues during the late lactation stage and late gestation stage than those during the dry period. In addition, we constructed six functional networks related to differentially expressed genes and found that these genes are closely related to mammary gland growth and development, apoptosis, immunity, substance transport, biosynthesis, and metabolism. Finally, we identi ed 35 differentially expressed transcription factors, which were classi ed into 16 families, and predicted that transcription factors of the basic leucine zipper domain (bZIP) family and basic helix-loop-helix (bHLH) family regulated the expression levels of genes related to mammary gland development and lactation.

Conclusions
Among the late lactation, dry period, and late gestation stages, there are differences in the expression levels of genes related to mammary gland growth and development, apoptosis, immunity, basic substance transport, biosynthesis, and metabolism in mammary gland tissues. Some genes in the same family or with similar functions have similar expression patterns. These differentially expressed genes or transcription factors work synergistically to participate in the regulation of mammary gland development and the physiological process of lactation.

Background
Mammary gland development and lactogenesis and the factors regulating these two processes have been topics of interest for quite some time. From the beginning, people knew the structure of the mammary gland through anatomical methods [1][2][3]. Later, various molecular techniques were used to discover molecules related to mammary gland development and lactation [4,5]. In recent years, the development and popularization of high-throughput sequencing technologies, such as RNA sequencing (RNA-Seq), have provided more powerful means for mammary gland research [6]. A number of molecules related to mammary gland growth, differentiation, metabolism, lactation, and diseases have been screened and identi ed. For example, Scheele et al. identi ed the molecular characteristics of mammary gland stem cells during branching morphogenesis and found that the transcription of mammary gland stem cells in the lumen and basal position might be different [7]. Kosciuczuk et al. discovered the differential expression of the mammary gland immune response-related genes FOS and tumor necrosis factor (TNF) through transcriptome analysis of mammary parenchyma tissues infected with Staphylococcus aureus [8]. Wagner et al. constructed single-cell atlases of mammary gland immunology and tumor ecosystems using single-cell sequencing technology [9]. In addition, in previous studies conducted in our laboratory, Laoshan dairy goats were used as a mammary gland research model to conduct studies on gene and noncoding molecules in mammary gland tissues at the early (20 d postpartum), peak (90 d postpartum) and late stages (210 d postpartum) of lactation. We mapped the expression pro les related to the regulation of mammary gland development and lactation at three stages and constructed a potential regulatory network [10][11][12]. Although scientists have made some progress in mammary gland research, for the complex physiological processes of mammary gland development and lactation, these ndings are just the tip of the iceberg, requiring further exploration.
The physiological structure and function of the mammary gland undergo corresponding changes at different developmental stages [13]. After a long period of lactation, mammary gland tissues begin to involute, and mammary gland cells undergo apoptosis. The number of mammary acinar and duct branches decreases [14]. Moreover, during the late lactation stage, the milk synthesis and secretion capacities and cell viability of epithelial cells signi cantly decline [15]. Subsequently, the mammary gland of goats stops secreting milk and enters the dry period. Then, the mammary gland tissues undergo remodeling; on the one hand, the lactating epithelial cells in the involution are cleared; on the other hand, the mammary gland cells are regenerated, proliferating and differentiating to form new mammary gland branches and acinar structures, preparing for the next lactation [16]. In late gestation, the structural differentiation of mammary gland tissues is more mature, with more acinus and duct branch numbers.
Moreover, mammary epithelial cells already begin to synthesize and secrete milk, and milk is stored in the acinus [17,18].
Based on the signi cant changes in the structural and physiological functions of mammary gland tissue during these three stages, we wanted to understand the genes that might play a potential regulatory role in mammary gland development and lactation during these three consecutive mammary gland tissue development stages. Therefore, we performed transcriptome sequencing on the mammary gland tissues of dairy goats at the late lactation (LL), dry period (DP), and late gestation (LG) stages and used bioinformatics analysis to plot the expression pro les of differentially expressed genes (DEGs) at three developmental stages to screen regulatory factors related to mammary gland growth, differentiation, lactation and immunity. We hope this study will further enrich the theoretical knowledge of lactation and mammary gland development and provide a theoretical basis for the rational breeding of dairy goats.

Basic statistical analysis of the sequencing results
According to the statistical analysis of the sequencing results (Table 1), 147,531,532, 151,341,075, and 161,094,090 clean reads were obtained for mammary gland tissues from the three stages (LL, DP, and LG), respectively; these clean reads were used for downstream and reference genome alignment, and the alignment ratio of the reads was above 95%. Calculation of the expression levels showed 18,472, 19,941, and 19,184 expressed genes during the LL, DP, and LG stages, respectively. We then ltered out lowexpression genes and obtained 13,598, 14,357, and 14,221 genes for the LL, DP, and LG stages, respectively, for expression abundance and differential expression analyses. Analysis of gene abundance and identi cation of DEGs Analysis of gene expression abundance showed that the LG stage had the largest number of highly abundant genes (129 genes), followed by the DP (104 genes) and LL (53 genes) stages (Table 2). Among the genes with moderate expression levels, the DP stage had the largest number of genes (5104 genes), and the LL stage had the least number of genes (2053 genes). The numbers of genes with low expression levels or ultralow expression levels in the LL stage were 10,122 and 1,370, respectively, much higher than those in the DP and LG stages. Principal component analysis and cluster analysis showed that the samples from the same developmental stage clustered, and correlation heat maps showed that the correlation coe cient for samples from the same developmental stage was greater than or equal to 0.92, higher than the correlation coe cient for samples from different stages ( Fig. 1 A-B). This indicates that samples from the same developmental stage have a high correlation and good repeatability. Subsequently, a total of 1,381 DEGs were identi ed using differential expression analysis among samples (Fig. 2). In the LL vs DP comparison, 627 genes were upregulated, and 158 genes were downregulated ( Fig. 2 A-E). Among the upregulated genes, butyrophilin subfamily 1 member A1 (BTN1A1), alpha-lactalbumin (LALBA), alpha-s1-casein (CSN1S1), LOC102178810, alpha-s2-casein (CSN1S2), beta-casein (CSN2), kappa-casein (CSN3), lactoperoxidase (LPO), LOC102169973 and LOC102177798 had higher fold changes in gene expression, and log2-fold changes in these genes were greater than 10. For the downregulated genes, the absolute values of log2-fold changes in transthyretin (TTR), peptidase inhibitor 16 (PI16), and transient receptor potential cation channel subfamily v member 6 (TRPV6) were greater than four. In the DP vs LG comparison, there were 322 upregulated genes and 495 downregulated genes. In the LL vs LG comparison, there were 413 upregulated genes and 132 downregulated genes. Genes with higher fold changes above two are listed in Figure 2b and 2c. To further understand the expression change patterns of DEGs, we constructed global gene expression heat maps of 1381 genes and identi ed the top 20 DEGs in the three developmental stages (Fig. 2 F-I). In the LL stage, the expression patterns of CSN3, CSN1S1, CSN1S2, and LALBA, which are related to lactoproteinencoding genes, were similar, and the expression levels were signi cantly lower in the DP stage than those in the LL and LG stages. In the DP stage, the expression patterns of ribosomal protein L23 (RPL23), ribosomal protein S3A (RPS3A), RPL19, RPL13A, RPLP0, and RPS2, which are related to the encoding of ribosomal protein subunits, were similar, and the expression levels of these genes were lowest in the LL stage and the highest LG stage. In the LG stage, we found the speci c high expression of the immunoregulatory genes progestogen-associated endometrial protein (PAEP) and lactotransferrin (LTF).
Please refer to Additional le 1 for speci c information of differential expression analysis. GO annotation results showed that apoptosis-related GO terms were identi ed in the LL vs DP comparison ( Fig.3 A-C), in which eight genes were related to intrinsic apoptotic signaling pathways in response to endoplasmic reticulum stress, 20 genes were related to intrinsic apoptotic signaling pathways, and 24 genes were related to regulation of the apoptotic signaling pathway. Among the biosynthesis-related GO terms, nine genes were associated with the monosaccharide biosynthetic process, 12 genes were associated with the cellular amino acid biosynthetic process, and 17 genes were related to fatty acid biosynthetic process. DEGs were also annotated to GO terms associated with growth and development; 14 genes were associated with mammary gland development, and 29 genes were associated with cell growth. In addition, GO terms related to the metabolism of substances (fatty acid metabolic process, regulation of lipid metabolic process, and cellular amino acid metabolic process) and substance transport (protein transmembrane transport, amino acid transmembrane transport, fatty acid transport, glucose transmembrane transport, and lipid transport) were also signi cantly enriched. In the DP vs LG comparison, more speci c GO terms related to mammary gland growth and development were enriched, including branching involved in mammary gland duct morphogenesis, mammary gland epithelial cell proliferation, mammary gland duct morphogenesis, and mammary gland epithelium development. In the LL vs LG comparison, mammary gland immunity-related GO terms including neutrophil activation involved in immune response, and neutrophil-mediated immunity were also signi cantly enriched. Please refer to Additional le 2 for speci c information of GO analysis. KEGG analysis identi ed important pathways ( Fig. 4 A-C), such as apoptosis, biosynthesis of amino acids, and fatty acid biosynthesis. In addition, the peroxisome proliferator-activated receptor (PPAR) signaling pathway, AMP-activated protein kinase (AMPK) signaling pathway, extracellular matrix (ECM)-receptor interaction and other pathways that are potentially related to mammary gland development were also identi ed. Please refer to Additional le 3 for speci c information of KEGG pathway analysis.

Functional network analysis of DEGs
Combining the GO annotation and KEGG analysis results, we identi ed 40 DEGs and 8 GO terms to construct a development/growth network ( Fig. 5 A). Among them, the number of genes related to mammary gland development was the largest, with 25 genes. The expression patterns of the 40 DEGs were divided into three categories: in cluster 1, the expression levels of PI16, insulin-like growth factor binding protein 4 (IGFBP4), and low density lipoprotein receptor-related protein 1 (LRP1) in the LL stage were signi cantly lower than those in the DP and LG stages. Fibroblast growth factor 7 (FGF7) and FGF2 are members of the broblast growth factor family and have extensive mitogenic and cell survival functions [19]. Both IGFBP4 and IGFBP3 are members of the insulin-like growth factor binding protein family and play regulatory roles in cell growth by binding to IGF [20]. The highly expressed IGFBP4 and IGFBP3 in LG may be closely associated with the proliferation and differentiation of mammary gland cells. In cluster 2, cluster of differentiation 109 (CD109), regulator of G protein signaling 4 (RGS4), IGFBP5, B cell lymphoma 2 like 11 (BCL2L11), amphiregulin (AREG), and sulfatase 2 (SULF2) were highly expressed in the LL and DP stages but were expressed at lower levels in the LG stage. In our previous study, BCL2L11 was shown to have potential function in promoting the apoptosis of mammary gland epithelial cells [17]. In cluster 3, Wnt family member 5A (WNT5A), reticulon 4 (RTN4), cysteine dioxygenase type 1 (CDO1), and glycerol-3-phosphate acyltransferase 4 (GPAT4) were highly expressed in the LL and LG stages and were expressed at low levels in the DP stage. CSN2 and CSN3 are important genes encoding casein [21], and the low expression level of these two genes during the DP stage may be closely related to no milk secretion in the DP stage. Similarly, we identi ed 43 genes and ve apoptosisrelated pathways to construct an apoptosis network ( Fig. 5 B), in which regulation of the apoptotic signaling pathway had the largest number of genes (34 genes). For intrinsic apoptotic signaling pathways, there were 31 genes. For extrinsic apoptotic signaling pathways, there were 13 genes. The expression patterns of the 43 apoptotic genes were also divided into three categories. In cluster 3, the gene expression levels of B cell receptor associated protein 31 (BCAP31), CD24, tumor protein D52 like 1 (TPD52L1), nuclear protein 1 (NUPR1), and caspase 8 (CASP8) were higher in the LL stage than in the DP and LG stages. CASP8, CASP9, and CASP3 encode members of the cysteine-aspartic protease family and play central roles in the execution of cell apoptosis [22], and their expression patterns may be related to the apoptosis of mammary gland cells during the LL stage. In addition, we also obtained 8 GO terms and 39 genes associated with mammary gland immunity in dairy goats and constructed an immunity network ( Fig. 5 C). The DP stage is not only a period for improving the nutrition status of goats and enhancing the constitution of the body but is also a period of susceptibility to mammary gland disease [23]. In cluster 2, we found that complement component 5a receptor 1 (C5AR1), S100 calcium-binding protein A9 (S100A9), complement component 7 (C7), complement component 1s (C1S), serpin family G member 1 (SERPING1), complement factor B (CFB), C2 and other genes were speci cally highly expressed in the DP stage. Among them, complement C1q subcomponent subunit A (C1QA), C1QB, C1QC, C1S, C2, and C7 all encode members related to the complement components of the immune system, serve as an important part of the innate immune system [24], and may participate in mammary gland immunity and defense processes in the DP stage. In addition, substance transport, synthesis and metabolism are the most basic physiological processes in the mammary gland and are crucial for mammary gland development and lactation. The substance transport network consists of eight GO terms and 30 genes ( Fig. 6 A). In cluster 2, solute carrier family 6 member 14 (SLC6A14), SLC1A5, SLC2A10, SLC2A9, and SLC27A3, which encode solute carrier family members, had higher expression levels in the LL and LG stages than in the DP stage.
Transcription factor analysis A total of 35 TFs were identi ed from a total of 1,381 DEGs ( Fig. 7 A-D), of which 15 TFs were upregulated in the LL vs DP comparison, 22 TFs were identi ed in the DP vs LG comparison (nine were upregulated, and 13 were downregulated), and 13 TFs were identi ed in the LL vs LG comparison (nine were upregulated, and four were downregulated). They were divided into 16 families, of which family information for 4 TFs was not clear. The bZIP family had the largest number of TFs, including 5 TFs, i.e., ATF4, CCAAT enhancer binding protein gamma (CEBPG), FOS like 2 (FOSL2), JunB proto-oncogene, AP-1 transcription factor subunit (JUNB), nuclear factor, interleukin 3 regulated (NFIL3). The expression patterns of CEBPG and ATF4 were similar according to the expression heat maps, of which the expression levels were higher in the LL stage that in the DP and LG stages, while the expression patterns of NFIL3, JUNB and FOSL2 were similar, of which the expression levels were higher in the DP stage than in the LL and LG stages. The bHLH family contains 4 TFs, i.e., aryl hydrocarbon receptor nuclear translocator (ARNT), basic helix-loop-helix family member A15 (BHLHA15), BHLHE40, and microphthalmia-associated transcription factor (MITF). The expression patterns of MITF, BHLHE40, and ARNT were similar, of which the expression levels were all higher during the DP stage. In addition, estrogen receptor 1 (ESR1), progesterone receptor (PGR), and estrogen-related receptor alpha (ESRRA) all belong to the ESR-like family. They also had similar expression patterns, with high expression in the LG stage and relatively low expression in the LL stage. Similar expression patterns of most TFs in the same family also appear in the homeobox and zf-C2H2 families. Please refer to Additional le 4 for speci c information of transcription factors analysis.

Real-Time Quantitative Reverse Transcription PCR (qRT-PCR)
From the histogram of transcriptome sequencing results ( Fig. 7 F), the 15 randomly selected DEGs include genes with relatively high expression levels, such as receptor for activated C kinase 1 (RACK1), GPAT4, and CSN2, as well as genes with relatively low expression levels, such as PCK2, MUC1, and SESN2. qRT-PCR detection showed that the expression patterns of the 15 genes obtained from qRT-PCR were consistent with those obtained from the transcriptome sequencing results (Fig. 7 E-G). Correlation analysis was performed on the RNA-Seq of the 15 genes during three developmental stages and the log2fold change obtained from qRT-PCR, and the correlation coe cient (cor) = 0.91 (p-value < 2.2e-16), indicating a high correlation of RNA-Seq with qRT-PCR detection, which further veri ed the accuracy of the sequencing results. All primer sequences are provided in the Additional le 5.

Discussion
In this study, RNA-Seq and bioinformatics analysis were performed on the mammary gland tissues of dairy goats in the LL, DP, and LG stages, and the gene expression pro les related to mammary gland development and lactation were mapped to construct mammary gland growth, proliferation, apoptosis, immunity, substance transport, synthesis, and metabolic-related networks. Prior to this study, studies have reported the transcriptome analysis of mammary gland tissues during the lactating and DP stages in pigs [18], sheep [25], and cattle [26]. The differences between this study and the above studies are that we selected three consecutive periods of mammary gland development. From the perspective of lactation, this is a process from the gradual termination of milk secretion to the absence of milk secretion and then to the restart of lactation. From the perspective of tissue structure, this is a process from mammary gland tissue involution to tissue remodeling, cell proliferation, and differentiation to form new mammary gland duct branches and acinar structure. Data analysis showed that a total of 1,381 DEGs were identi ed in the three mammary gland developmental stages, indicating that gene expression was indeed different among the three stages (Fig. 2). Perhaps the differential expression of these genes leads to differences in the morphology, structure and function of the mammary gland during the three stages. We identi ed the 20 DEGs with the highest expression in the mammary gland during the three stages, and these DEGs included CSN1S1 and CSN1S2, which encode alpha-casein, CSN2, which encodes beta-casein, CSN3, which encodes kappa-casein, and LALBA, which encodes alpha-lactalbumin. The expression levels of these gens were lower in the DP stage than in the LL and LG stages. The expression patterns of these genes found in this study were consistent with those of previous studies. [26,27], and their high expression levels may be closely related to milk synthesis during the LL and LG stages. FGF2 and FGF11, as members of the broblast growth factor family, exhibited similar expression patterns, and their expression levels were lower in the LL stage than in the DP and LG stages. It has been reported that FGF2 promotes the proliferation of mammary epithelial cells, which is conducive to mammary gland remodeling and maintaining milk secretion [28]. RPS3A, RPS2, RPL19, RPLP0, RPL13A, RPL23, and RPS8 encode ribosomal proteins. Similarly, they also exhibited similar expression patterns, namely, expression levels during the LL stage were lower than those in the DP and LG stages. Therefore, we hypothesize that genes in the same family or genes with similar functions often have similar expression pattern in the mammary gland during different developmental stages. Other studies reported that in some cases, some gene family members had the same (or almost the same) sequence. These families allow large amounts of genes to be expressed in a short time as needed [29]. However, we used a limited number of goat mammary gland samples, and studies need to be conducted to verify this hypothesis.
The physiological structure of the goat mammary gland changes signi cantly during the three developmental stages. These changes are closely related to the speci c expression of genes during different developmental stages [30]. According to the development/growth network (Fig. 5A), IGFBP5, a member of the insulin-like growth factor family, was associated with mammary gland development and morphogenesis and was highly expressed in the LL stage. It has been reported that IGFBP5 can accelerate the apoptosis of mammary epithelial cells and that the overexpression of IGFBP5 in vivo can lead to impaired mammary gland development [31]. Different from IGFBP5, IGFBP4 and IGFBP3 were expressed at low levels in the LL stage and at higher levels in the DP and LG stages. Our ndings are consistent with those by Alla et al. in a study of the mammary gland tissues of adolescent, gestational, lactating and degenerative mice [32]. This again con rmed that insulin-like growth factor binding protein family members may have a potential regulatory relationship with mammary gland development. In addition, WNT5A and RTN4 genes involved in mammary epithelial cell proliferation had similar expression patterns, and all were expressed at low levels in the DP stage and high levels in the LG stage. It has been reported that WNT5A has an inhibitory effect on mammary gland development [33] and that RTN4 can inhibit hepatocellular carcinoma cell (HCC) proliferation and promote HCC apoptosis [34], which may counteract cell proliferation in LG stage. However, mitogen-activated protein kinase 1 (MAPK1), ESR1, cyclin D1 (CCND1), and mediator complex subunit 1 (MED1) were expressed at low levels in the LL stage and at high levels in the DP and LG stages. All these genes could promote the proliferation of mammary gland cells [35][36][37][38]. Therefore, RTN4 and WNT5A expression in the LG stage may be related to the regulation of mammary gland cell proliferation by other genes and to the differences in roles of these two genes in different tissues and cell types [35]. Transforming growth factor beta 3 (TGFB3) and transforming growth factor beta receptor 2 (TGFBR2) are both associated with the TGF-β signaling pathway, have become important molecular markers of breast cancer, and play important roles in the TGF-β pathway for mammary gland morphogenesis and lactation [39]. In addition, the elevated levels of prolactin and progesterone during pregnancy are conducive to mammary gland tissue remodeling and the formation of mammary gland lobe and duct branch structures [40]. However, we found that PGR and PRLR, as receptors for progesterone and prolactin, respectively, were highly expressed in mammary gland tissues during the LG stage, further demonstrating their promoting effects on mammary gland development, differentiation and lactation during the LG stage.
Apoptosis is essential to maintain the stability of the internal environment of the mammary gland, cell renewal and normal cell functions and activities [41]. According to the apoptosis network (Fig. 5B), members of the caspase family, CASP8, CASP9, and CASP3, are associated with extrinsic apoptotic signaling pathways, and their expression levels during the LL stage were much higher than those during the DP and LG stages. This is consistent with previous reports; that is, when the mammary gland degenerates, caspases are activated, and the expression levels of caspases are upregulated [42]. CASP8 and FADD like apoptosis regulator (CFLAR), galectin 3 (LGALS3), interferon alpha-inducible protein 6 (IFI6), lamin A/C (LMNA), janus kinase 2 (JAK2), TNF receptor superfamily member 1B (TNFRSF1B), secreted frizzled-related protein 1 (SFRP1), and protein tyrosine phosphatase receptor type C (PTPRC) are also associated with extrinsic apoptotic signaling pathways, and they are highly expressed in the DP stage. Among them, IFI6 encodes anti-apoptotic proteins that reside on the outer mitochondrial membrane, and interrupting the mitochondrial function of IFI6 may prevent breast cancer metastasis [43]. The constitutive activation of JAK2 in mammary epithelial cells can enhance signal transducer and activator of transcription 5 (Stat5) signal transduction, promote the generation of mammary acinus, inhibit apoptosis, and promote the develop of breast tumors [44]. SFRP1-encoded protein can block the Wnt signaling pathway, and low expression of SFRP1 can promote mammary epithelial cell proliferation and the branching of mammary ducts [45]. Based on existing reports and high expression levels in the DP stage, these three genes might be involved in the regulation of cell apoptosis in mammary gland tissue during remodeling in the DP stage. In addition, CFLAR-encoded proteins are regulators of cell apoptosis and are structurally similar to caspase-8.
LGALS3 is a member of the galectin family, which encodes carbohydrate binding proteins, and members of this family have a nity for β-galactoside [46]. LMNA, TNFRSF1B and PPTRC are all associated with cell differentiation [47][48][49]. However, their role in the mammary gland is still unclear; therefore, further veri cation of their speci c function in mammary gland development is needed. RACK1 is not only associated with the intrinsic apoptotic signaling pathway, but its expression level is in the top 20 during the DP and LG stages. The expression level of RACK1 in the LL stage was the lowest, while it was the highest in the LG stage. The scaffold proteins encoded by RACK1 can participate in the recruitment and assembly of a variety of signaling molecules, including protein kinase C [50], phosphodiesterases [51], Src tyrosine kinase [52], and insulin and insulin-like growth factors [53]. Therefore, RACK1 may participate in the apoptosis of mammary gland cells by regulating a variety of signaling molecules.
Immunomodulation is essential for mammary gland health and is the basis and guarantee for its normal function. On the one hand, the milk secreted by the mammary gland provides nutrition and immune protection for newborn pups; on the other hand, the mammary gland involution and DP stages are not only periods for mammary gland repair and renewal but are also critical periods for defending against various mammary diseases [54]. According to the immunity network (Fig. 5C), we found that genes associated with the regulation of complement activation, such as C5AR1, C7, C1S, SERPING1, CFB, C2, C1R, C1QB, C1QA, C1QC, and CFI, had similar expression patterns; they were all highly expressed in the DP stage. These results suggested that these genes might play a potential regulatory role in mammary gland immunity during the DP stage. C2, C1R, C1S, C1QA, C1QB, and C1QC are all associated with the classical immune pathway [55], have antibacterial functions, regulate immune adhesion, mediate in ammation, and play important roles in inherent immunity against microorganisms [56]. In addition, neutrophil-mediated immunity is also associated with innate immunity. lipocalin 2 (LCN2) was highly expressed in the LL stage. It is known that protein encoded by LCN2 is neutrophilic gelatinase-associated lipocalin, which inhibits the growth of bacteria through the chelation of iron-containing carriers, thereby playing a role in innate immunity [57]. In addition, C5AR1, S100A9, bone marrow stromal cell antigen 1 (BST1), brinogen like 2 (FGL2), PTPRC, cathepsin H (CTSH), and tubulin beta class I (TUBB) were highly expressed in the DP stage. C5AR1 encodes a receptor for the activated fragment C5a generated after the activation of complement and can be used as a chemokine to induce neutrophil phagocytosis to pathogens [58]. S100A9 encodes calcium and zinc binding proteins and plays an important role in the regulation of in ammatory processes and the immune response. It can induce neutrophil chemotaxis and adhesion and can enhance the anti-bactericidal activity of neutrophils through promoting phagocytosis by the activation of spleen associated tyrosine kinase (SYK), PI3K/AKT and extracellular signal-regulated kinase 1/2 (ERK1/2) [59]. BST1, FGL2, PTPRC, CTSH, and TUBB can also induce neutrophils to participate in the ght against pathogens, thus playing a potential protective role in mammary gland health during the DP stage. All three genes, peptidoglycan recognition protein 1 (PGLYRP1), beta-2-microglobulin (B2M), and LTF, were highly expressed in the LG stage. LTF (lactoferrin) is a transferrin protein that transfers iron to cells and controls the levels of free iron in blood and external secretions. It is widely present in the milk of humans and other mammals and has antibacterial activity. It is an important part of the nonspeci c immune system [60]. In addition, humoral immune responses, B cell-mediated immunity, and T cellmediated immunity also play important roles in the mammary gland. It is known that the protein encoded by JCHAIN can connect two monomeric units of IgM or IgA and induces a larger IgA polymer. This protein also helps to bind these immunoglobulins with secretory components to resist pathogens [61]. Both CD74 and CD81 are associated with B cell-mediated immunity, and CD74 is highly expressed in the DP stage, while CD81 is highly expressed in the LG stage. In summary, mammary gland immunity is associated with both the innate immune system and the speci c immune system. The collaboration of these genes in the immune system during the three stages allows mammary gland immunity to proceed smoothly, thereby protecting mammary gland health.
The synthesis of proteins, lipids and lactose in milk begins with the transport of basic substances, such as amino acids, fatty acids and glucose, in plasma [62]. It can be seen from the transport network (Fig. 6A) that the members of the solute carrier family that were differentially expressed during the three stages were involved in fatty acid (SLC27A1, SLC27A3, and SLC27A4) amino acid (SLC1A5, SLC6A8, SLC6A14, and SLC25A29), and glucose (SLC2A4, SLC2A9, SLC2A10, and SLC2A8) transport. The members of these solute carrier families were highly expressed in the LL and LG stages, while low expression was observed in the DP stage. Compared with the gestation stage, the mRNA expression levels of SLC27A3 and SLC27A4 were upregulated 37-fold and 1.4-fold, respectively, during the lactation period [63]. In addition, SLC1A5, SLC2A4, and SLC2A8 were expressed at higher levels during the lactation stage than during the DP stage [26]. This indicates that the members of the solute carrier family are very important for substance transport to the mammary gland during the lactation stage, and their high expression during lactation is closely related to the higher lactation yield requirement. In addition, FABP3, PLIN2, MFSD2A, CD36 and other genes are also involved in fatty acid transport in the mammary gland and are highly expressed during the lactation period. By using proteomics, Zheng et al. found that the expression level of PLIN2 in the LL stage was lower than that in the peak period, which had a potential impact on milk fat synthesis in the mammary gland [64]. CD36 and FABP3 were signi cantly upregulated in the lactation period [65], and FABP3 is one of the most abundant isoforms in bovine mammary gland tissues; FABP3 expression is affected by the lactation period [66]. Proteins encoded by MFSD2A can mediate lipid transport, participate in the establishment of the blood-brain barrier and are required for brain growth and function [67]. In summary, these four genes might cooperate to participate in fatty acid uptake and transport in the mammary gland. In addition, we also found that ADP ribosylation factor like GTPase 6 interacting protein 1 (ARL6IP1) may be involved in amino acid transport; however, its expression pattern is different from that of solute carrier family members. ARL6IP1 can regulate amino acid transport and negatively regulate cell apoptosis by regulating the activity of caspase-9 (CASP9) to protect cell survival [68]. Therefore, ARL6IP1 is highly expressed in the LG stage and has low expression levels in the LL and DP stages. It may be involved in the transportation of amino acids in the LG stage; on the other hand, it may promote the proliferation and differentiation of mammary gland cells in the LG stage. The above data related to substance transport in the mammary gland indicate that during lactation, at different periods, the mammary gland has different demands for glucose, amino acids and fatty acids, and that the RNA expression of the related three types of transport proteins will also undergo proper adjustment and change, adapting to the lactation requirement of the mammary gland. The entire transport process is complex, which may require the coordination of the proteins encoded by multiple genes to achieve organized substance transport in the mammary gland.
The biosynthesis and metabolism of substances are the basic physiological processes of mammary gland development and lactation [69]. In addition to the genes encoding the main proteins in milk, such as casein (CSN1S1, CSN1S2, CSN2, and CSN3) and lactalbumin (LALBA), according to the biosynthesis network (Fig. 6B), a total of 12 genes (GLUL, OAT, CTH, BCAT2, CDO1, PSAT1, GPT2, PYCR1, glutaminase (GLS), phosphoglycerate dehydrogenase (PHGDH), phosphoserine phosphatase (PSPH), and asparagine synthetase (ASNS)) associated with cellular amino acid biosynthetic processes were identi ed. CDO1, PYCR1, PSAT1, BCAT2, ASNS and PSPH all had similar expression patterns and were highly expressed in the LL stage. CDO1 is known to be a key regulator of cysteine concentrations in cells and is identi ed as a candidate gene affecting milk protein traits [70]. PYCR1 encodes an enzyme that catalyzes the conversion of NAD(P)H-dependent pyrroline 5-carboxylate to proline [71]. In addition, proteins encoded by PSAT1 and PSPH are involved in serine synthesis [72]. BCAT2 encodes a branched-chain aminotransferase, and the encoded protein forms a dimer. This dimer catalyzes the rst step in the production of branched-chain amino acids, such as leucine, isoleucine and valine. The abnormal expression of BCAT2 also affects cell proliferation and differentiation [73]. ASNS-encoded proteins are involved in the synthesis of asparagine [74]. The expression patterns of PYCR1, PSAT1, and ASNS were consistent with previous reports that the expression levels of these three genes during the lactation period are higher than those during the DP stage [26]. OAT, GPT2, PHGDH, and CTH are also involved in amino acid biosynthesis, and their expression levels were higher in the LG stage. These genes play important regulatory roles in amino acid biosynthesis in the mammary gland during the lactation period and provide enough amino acids in the mammary gland to achieve protein synthesis. The mammary gland is not only a place for protein synthesis but also an important place for lipid synthesis. It is known that the mammary gland of a lactating mouse synthesizes and secretes milk fat equivalent to its entire body weight in a 20-day lactation cycle, making the mammary gland one of the most active known lipid synthesis organs [62]. According to the biosynthesis network, we identi ed 12 genes associated with fatty acid synthesis. These include genes associated with fatty acid uptake in blood (LPL), the synthesis of long-chain fatty acids (TECR and LPGAT1) and short-chain fatty acids (ACSS1 and ACPS2), de novo fatty acid biosynthesis (ACACA and FASN), fatty acid desaturation (FADS1 and SCD), transcriptional regulation (INSIG1 and CD74), ketone body utilization (PDK4), and redox (GPX4). It has been reported that LPL, ACSS1, ACSS2, FASN, FADS1, and INSIG1 have similar expression patterns, all of which are highly expressed in the peak lactation stage, and the expression pattern of LPL in the bovine mammary gland is very similar to the lactation curve pattern, which may indicate that this gene plays an important role in maintaining milk synthesis [75]. SCD is a core enzyme in lipid metabolism and is the rate-limiting enzyme in the catalyzed synthesis of monounsaturated fatty acids from saturated fatty acid [76]. These lipid metabolism-related genes show different expression patterns during the three developmental stages, which are mutually adaptive to the fat synthesis function in different developmental stages. In addition, genes involved in carbohydrate synthesis and metabolism were also signi cantly enriched. GCK, ATF4, PGM1, and KAT2B are related to gluconeogenesis and glucose metabolic processes. GCK encodes a hexokinase family member, and hexokinase phosphorylates glucose to produce 6-phosphate glucose, which is the rst step in most glucose metabolic pathways (Fig. 6C). The enzyme encoded by PGM1 is involved in the decomposition and synthesis of glucose. GCK and PGM1 are known to have higher expression levels during lactation than during the DP stage [77,78]. The expression pro les of ATF4 and KAT2B in the mammary gland during different lactation stages were mapped for the rst time. In summary, genes related to the biosynthesis and metabolism of basic substances such as glucose, amino acids and fatty acids in the mammary gland were screened using biosynthesis and metabolism networks, and their expression pro les in the mammary gland during three development stages were revealed. This study provides a reference for studying their roles in mammary gland development and lactation function in the future.
In the mammary gland, TFs interact with speci c gene sequences (open or closed), and coordinated gene expression occurs at the correct time and in appropriate numbers. [79]. During mammary gland development and the lactation cycle, many site-speci c TFs are essential for the proper growth, branching and involution of the mammary gland [80]. It is known that the overexpression of ATF4 inhibits the proliferation and differentiation of mammary epithelial cells, leading to lactation dysfunction and accelerated mammary gland involution [81]. Our results showed that ATF4 was highly expressed during the LL stage, possibly to promote the involution of mammary gland tissue and apoptosis during the LL stage ( Fig. 7A-D). In addition, the expression of JUNB increases after forced weaning [82]. We also detected high JUNB expression in the DP stage, and we speculate that JUNB has effects on milk synthesis and secretion. CEBPG, FOSL2, and NFIL3 belong to the bZIP family, as do ATF4 and JUNB. bZIP TFs are present in all eukaryotes and are one of the largest families of dimeric TFs. It has been reported that this family of TFs regulates cell growth [83]. We found that this family of TFs had the greatest number of TFs with differential expression in the mammary gland; therefore, these TFs might be involved in mammary gland development and lactation regulation. In addition, four members of the bHLH family (ARNT, BHLHA15, BHLHE40, and MITF) were differentially expressed in the three developmental stages.
ARNT is known to be necessary for mammary gland development in adolescent females [84]. BHLHE40 is associated with breast cancer [85]. However, the function of BHLHA15 and MITF in the mammary gland has not yet been reported; therefore, further study on their speci c role in mammary gland development is warranted. In summary, these 35 TFs are differentially expressed in mammary gland tissues during the three developmental stages and may play potential regulatory roles in gene expression in the mammary gland, thereby affecting mammary gland development and the physiological process of lactation.

Conclusions
We found that the mRNA expression levels of genes associated with mammary gland growth, differentiation, apoptosis, immunity, substance transport, substance synthesis and metabolism were signi cantly different among the three mammary gland development stages (LL, DP, and LG). The genes in the same family or genes and TFs with similar functions have potential similar expression patterns in mammary gland tissue at different developmental stages. Mammary gland development and the physiological process of lactation involve the synergy of multiple genes involved in regulation. We mapped the expression pro les of DEGs in the mammary gland during the three developmental stages, providing a theoretical reference for further understanding the molecular mechanisms of mammary gland development and lactation.

Ethics statement
The methods and experimental procedures used in this study were all approved and directed by the animal protection and ethics committee of Shandong Agricultural University (protocol number: SDAUA-2018-048). All participating personnel were required to undergo rigorous training and self-protection and to alleviate the suffering of experimental animals to the greatest extent possible.
Experimental animal treatment and sample collection In this study, nine healthy and disease-free Laoshan dairy goats were used, all of which came from Aote goat breeding farm (Qingdao, Shandong province, China), and all goats had free access to food and were managed under the same conditions . They were 4 years old, on average (third parity). Mammary gland tissues were collected at the LL (n = 3, 240 d postpartum), DP (n = 3, 300 d postpartum), and LG (n = 3, 140 d after mating) stages. The speci c tissue collection methods were as follows. After intravenous injection of pentobarbital sodium (100 mg/kg) for anesthesia, the muscles relaxed, and the heart and respiration were arrested; then, the goats were quickly dissected, and mammary gland tissues were harvested. Tissues were quickly placed in RNase-free cryotubes and stored in liquid nitrogen.

RNA extraction and sequencing analysis
Total RNA was extracted using a Trizol Based RNAiso Plus kit (Code No 9108Q, Takara). RNA quality was analyzed using a NanoDrop 2000 C (Thermo). A sample was used for subsequent experimental analysis when the RNA integrity number (RIN) was > 8. Nine sequencing libraries (LL1, LL2, LL3, DP1, DP2, DP3, LG1, LG2, and LG3) were constructed using TruSeq RNA Library Prep Kit v2 (Illumina), and sequencing was completed using a HiSeq 2500 platform. After obtaining the raw data, sequencing quality was assessed using FastQC software (http://darlinglab.org/tutorials/fastqc/), and the sequencing adapters (adapter sequences, primers, poly-A tails and other types of unwanted sequence from high-throughput sequencing reads) [86] were processed through Trimmomatic software. HISAT2 software [87] was used to construct indices of the goat genome (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/001/704/415/GCF_001704415.1_ARS1), and sequence alignment was performed. The number and percentage of sequence alignment were quanti ed using SAMtools software [88]. StringTie software was used to assemble the sequence alignments to transcripts [89]. The expression level of each gene was quanti ed using HTSeq [90].Statistical analysis of expression levels and differential expression analysis. First, low-expression genes (FPKM < 0.5) were ltered based on fragments per kilobase of transcript per million mapped reads (FPKM). The screening criteria for each gene were as follows: when FPKM ≥ 500, the gene was de ned as highly expressed; when 500 > FPKM ≥ 10, the gene was de ned as moderately expressed; when 10 > FPKM ≥ 1, the gene was de ned as lowly expressed; and when 1> FPKM ≥ 0.5, the gene was de ned as ultralowly expressed. Prior to differential expression analysis, principal component analysis and intersample correlation analysis were performed on gene expression levels. Then, differential expression analysis was performed using DESeq2 software [91]. When comparing the same gene between two groups, if the false discovery rate (FDR) was ≤ 0.05 and the log2-fold change was ≥ 1, the gene was considered differentially expressed. The DEGs between different groups were analyzed, and differential gene expression was displayed using volcano plots and heat maps. The top 20 DEGs for each developmental stage (LL, DP, and LG) were identi ed and are displayed using histograms.
Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis The DEGs identi ed in each group were annotated using the GO database and the KEGG database. The GO terms and KEGG pathways that met FDR ≤ 0.05 were selected for downstream analysis. To further clarify and re ne the function of genes, we rst classi ed GO terms into multiple categories, including apoptosis-related, biosynthesis-related, growth and development-related, metabolism-related, substance transport-related, and other important function-related terms. Similarly, KEGG pathways were also classi ed into apoptosis-related, biosynthesis-related, immunity-related, metabolism-related, and other function-related pathways. Finally, the GO terms and KEGG pathways related to mammary gland development and lactation function are presented as bar graphs and bubble plots, respectively.

Functional classi cation of DEGs and analysis of expression patterns
To further clarify the relationship between speci c genes and different important GO terms, we constructed six gene-concept networks, including development/growth, apoptosis, immunity, transport, biosynthesis, and metabolism networks. Subsequently, the patterns of gene expression changes in the network are shown in heat maps. The FPKM values of the gene expression levels were normalized based on Z-score before the heat map was plotted, and then the genes were hierarchically clustered.

Screening and analysis of transcription factors
Transcription factors (TFs) are key regulators of gene expression [92]. To clarify the classi cation of TFs among the DEGs, we used the AnimalTFDB 3.0 website [93]. DEGs at three developmental stages were annotated. Subsequently, the number of TFs contained in different families was counted, and the expression patterns of these differentially expressed TFs were analyzed.
Veri cation of the sequencing accuracy using real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) To further verify the accuracy of the transcriptome sequencing results, we randomly selected 15 DEGs and designed primers using NCBI Primer-BLAST [94]. Reverse transcription and PCR ampli cation were performed using a One Step TB Green® PrimeScriptTM RT-PCR Kit (Takara). Quantitative PCR was performed using a LightCycler 96 uorescence quantitative PCR instrument (Roche). The relative expression levels of genes during the three developmental stages were calculated using 2 -△△CT [95].
Tukey's honestly signi cant difference (HSD) test was used for the statistical analysis of gene expression levels at different developmental stages. Unless otherwise noted, the above plots were all generated using R software (Version 3.6.0).

Declarations
Ethics approval and consent to participate The methods and experimental procedures used in this study were all approved and directed by the animal protection and ethics committee of Shandong Agricultural University (protocol number: SDAUA-2018-048). All participating personnel were required to undergo rigorous training and self-protection and to alleviate the suffering of experimental animals to the greatest extent possible.

Consent for publication
Not applicable.

Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information les.

Competing interests
The authors declared that they had no competing interests.    GO annotation of DEGs in the three developmental stages A GO functional annotation bar graphs for the DEGs identi ed in the LL vs DP comparison group. B GO functional annotation bar graphs for the DEGs identi ed in the DP vs LG comparison group. C GO functional annotation bar graphs for the DEGs identi ed in the LL vs LG comparison group.