Analysis of PM physicochemical factors
The levels of PM moisture, TN, AP, pH, and acetic, caproic, butyric, and lactic acid content from different cellar samples changed significantly with increasing age of the cellar (Table 1, Fig. 2). pH is an important factor that has a cumulative effect on the properties of PM; however, it did not show differences in different spatial locations in the 30-, 100-, and 300-year-old cellars (p>0.05). The relative moisture content showed a significant difference in the 100- and 300-year-old cellars. The moisture content in position 4 was significantly lower than that in positions 1 and 2 (p<0.05). For the same positions, the moisture content in the 300-year-old cellars was slightly higher than that in the 100- and 30-year-old cellars, but the difference was not significant (p>0.05). The moisture content produced by microbial metabolism gradually accumulated and sank to the bottom of the cellar during fermentation, resulting in higher moisture content at the bottom of the cellar compared with that at other spatial positions. Nutrients are also important contributors to the biochemical properties of PM. The concentration of TN at position 4 was significantly lower than that at other positions in the 30-, 100-, and 300-year-old cellars (p<0.05). Notably, the concentration of TN at position 1 in the 300-year-old cellar was significantly lower than that in the 30- and 100-year-old cellars. The concentration of AP at position 1 in the 100- and 300-year-old cellars was significantly higher than that at positions 2, 3, and 4 (p<0.05) and at positions 2 and 4, respectively.
Organic acids are important precursors of esters in CSFL and inevitably affect the quality of CSFL. We compared the organic acid contents spatiotemporally in the PM. The acetic acid content (Fig. 2A) was significantly higher at position 3 (p<0.05) and position 1 (p<0.05) than that at other positions in the 30- and 100-year-old cellars and 300-year cellars, respectively. The butyric acid content (Fig. 2B) showed no significant spatial difference in the 30-year-old cells (p> 0.05) but was significantly higher at position 1 than that at other spatial locations (p<0.05) in the 100- and 300-year-old cellars. The caproic acid (CA) content (Fig. 2C) showed different trends in the different cellar types. The CA content at positions 2 and 1 was significantly higher than that at other positions (p <0.05) in the 30- and 300-year-old cellars, respectively. As for lactic acid content (Fig. 2D), positions 4 and 1 showed significantly higher levels than other spatial locations in the 30- (p>0.05) and in the 100- and 300-year-old cellars (p<0.05), respectively. In summary, organic acids accumulated at the bottom of the pit with continuous CSFL fermentation, leading to a high content of organic acid at the bottom of the pit, when compared with other spatial locations. In addition, the total organic acid content increased with the continuous use of pits.
Microbial succession in different cellars
Spatiotemporal differences of microbial diversity in PM
Succession patterns of PM microbial communities in 30-, 100-, and 300-year-old cellars were investigated. High-throughput sequencing was applied to obtain information on the active microbial community in the three cellar types at four positions, and sequence similarity thresholds were used to define the OTUs. Based on the sequencing results, the total reads were correlated with bacterial and archaeal phyla. The Good’s coverage estimator revealed that 91.72–94.05% of bacterial OTUs were obtained in all PM samples, which indicates that the generated data were reasonable reflection of the microbiota community. In addition, the bacterial Shannon index with a range of 5.41±0.96 to 5.48±0.86 and the bacterial Simpson index with a range of 0.9±0.07 to 0.95±0.02 at different PM samples were obtained, and there were significant differences (p<0.05) among the different PM samples (Table 1), suggesting similar overall species diversity.
For bacterial community composition, only prokaryotic abundance differences were found, without prokaryotic structural differences. At the phylum level, Firmicutes, Bacteroidetes, Euryarchaeota, Actinobacterium, Chloroflexi, and Proteobacteria were predominant in the 30-, 100-, and 300-year-old PM samples. At the genus level, 221 genera were detected. Bacterial genera such as Caproiciproducens, Clostridium, Lactobacillus, Bacillus, Weissella, Sedimentibacter, Atopobium, Brachybacterium, Ralstonia, Acinetobacter, Proteiniphilum, Anaerocella, Prevotella, Bellilinea, Petrimonas, Staphylococcus, Caloramator, Lysinibacillus, Caldicoprobacter, Caloribacterium and archeal genera such as Methanobacterium and Methanobrevibacter were detected with relative abundances higher than 0.1% in the 30-, 100-, and 300-year-old PM samples (Fig. 3). In this study, we defined core microbiota genera as those detected in all PM samples and with relative abundance higher than 1.0% in all samples. Based on this, the core microbiota genera were Lactobacillus, Caproiciproducens, Clostridium, Proteiniphilum, Prevotella, Bellilinea, Petrimonas, Methanobacterium, and Methanobrevibacter and accounted for 60.28%, 84.39%, and 80.65% in the 30-, 100- and 300-year-old PM samples, respectively. Notably, Prevotella, Bellilinea, Petrimonas, Methanobacterium, and Methanobrevibacter were increased in the 100- and 300-year-old PM samples when compared with those in the 30-year-old PM samples, especially for Methanobacterium and Methanobrevibacter. We further investigated the differential microbes among the three PM samples. The results (Fig. 3D) indicated that Proteiniphilum, Caproiciproducens, Methanobacterium, and Methanobrevibacter were the main differential microorganisms among the 30-, 100-, and 300-year-old samples. With the continuous use of the cellars, microbes will inevitably undergo succession progress. We calculated the beta nearest taxonomic index (βNTI) value to compare the influence of the two (deterministic and stochastic) assembly processes of the 30-, 100-, and 300-year-old cellars (Fig. 3E) to compare and analyse the differences in the construction of microbial communities in the different PM cellars. The results showed that both deterministic and stochastic processes promoted the assembly of the microbiome in the PM cellars, with the pit used for a long time, especially for 300-year-old cellars, the microbial composition follows a deterministic process.
Microbial diversity in different spatial locations
The cellar environment changed with the CSFL fermentation process, such as the Huangshui generation. It gradually changed from a micro-anaerobic to an anaerobic environment, leading to a diversity of microbial communities in different spatial locations. At position 1 (Additional file: Figure S1-A), Lactobacillus, Bellilinea, Methanobrevibacter, Petrimonas, and Prevotella were found in the 30-, 100-, and 300-year-old cellars. Petrimonas (13.18%) showed a higher relative abundance in the 100-year-old PM samples than that in the 30- and 300-year-old PM samples. Moreover, Lactobacillus (41.82%) and Prevotella (1.78%) were higher in the 300-year-old PM samples than in the 30- and 100-year-old PM samples, but the relative abundance of Bellilinea (0.7%) for the 300-year cellar was lower than that for the 30- and 100-year samples. In contrast, Bacillus, Weissella, Brachybacterium, Staphylococcus, Anaerocella, Caloramator, Caldicoprobacter, and Caloribacterium were only detected in the 30-year, 100-year, and 300-year-old PM samples, respectively.
At position 2 (Additional file: Figure S1-B), the dominant microbes were Lactobacillus, Bellilinea, Methanobrevibacter, Methanobacterium, Petrimonas, Caproiciproducens, and Prevotella. The Prevotella (12.38%) and Lactobacillus (12.34%) in the 30-year-old samples were higher than in the 100- and 300-year-old PM samples. Bellilinea (18.04%), Petrimonas (21.18%), and Methanobrevibacter (30.16%) in the 300-year-old samples were higher than in the 30- and 100-year-old samples, and Methanobacterium (9.96%) and Caproiciproducens (15.63%) in the 100-year-old samples were higher than in the 30- and 300-year-old samples. Anaerocella and Lysinbacillus were only detected in the 30- and 300-year-old samples, respectively.
At position 3 (Additional file: Figure S1-C), the core microbes were Bellilinea, Methanobrevibacter, Methanobacterium, Petrimonas, Proteiniphilum, Clostridium, and Prevotella. Prevotella (26.45%) and Methanobacterium (9.28%) in the 30-year-old samples were higher than in the 100- and 300-year-old samples, whereas Proteiniphilum (14.27%), Bellilinea (29.09%), and Clostridium (6.68%) in the 300-year-old samples were higher than in the 30- and 100-year-old samples. Methanobrevibacter (41.79%) in the 100-year-old PM samples was higher than in the 30- and 300-year-old PM samples.
At position 4 (Additional file: Figure S1-D), the core microbes were Bellilinea, Methanobrevibacter, Petrimonas, Lactobacillus, and Prevotella. Petrimonas (19.27%) and Prevotella (14.53%) in the 30-year-old PM samples were higher than in the 100- and 300-year-old PM samples, and Methanobrevibacter (52.33%) and Lactobacillus (21.78%) in the 100-year-old samples were higher than in the 30- and 300-year-old samples. Moreover, Bellilinea in the 300-year-old PM samples was higher than in the 30-and 100-year-old PM samples.
Differences in the physicochemical properties of different spatial positions in the pit, such as oxygen, moisture content, nutrients, and the contact between the jiupei (fermented grains) and PM, will inevitably lead to differences in microbial composition and abundance. Notably, the relative abundance of the same microorganism is also different in the use of the same year pit. In summary, the microbiota composition among the four spatial locations were different; for example, Caproiciproducens and Lactobacillus were concentrated at the bottom of the cellar, while Proteiniphilum and Clostridium were highly abundant at positions 3 and 4. In addition, some genera were only detected in specific spatial locations, indicating the complex interactions between the microorganisms. Moreover, the microbial composition also correlated with the physicochemical environment in the cellars, and the diversity of microbes could influence the flavour substances in CSFL.
After studying the microbial communities of different PM samples, we further evaluated the similarities and differences in the microbiota using a Bray-Curtis approach. A non-metric multi-dimensional scaling analysis revealed that all samples in bacteria were clustered into two parts and showed year-dependent clustering (Fig. 4A). In summary, samples from the 30-, 100-, and 300-year-old samples were centralised and formed clusters, especially for 30- and 100-year-old samples, suggesting that samples from different years are possibly in different transitional states and bacterial communities in the 30- and 100-year-old cellars had similar microbial compositions. The microbial community continues to evolve over time, stabilises, and gradually forms a stable microbial environment. Therefore, the age of the cellar is the main factor driving microbial succession.
Relationships of microbial community with physicochemical properties
Environmental factors in fermentation can affect the growth and metabolism of microorganisms. Hence, physicochemical factors (pH, moisture content, TN, AP, acetate, caproate, butyrate, and lactate) were selected and analysed to reveal their connection with the microbiota community (Fig. 4B). The results indicate that the contents of lactate, butyrate, acetate, and caproate significantly affected microbial succession change (p<0.05) and that lactate and caproate contents were positively correlated with prokaryotic communities in the 30-, 100-, and 300-year-old samples at positions 1 and 2. Moreover, butyrate and acetate contents were positively correlated with prokaryotic communities in the 30- and 100-year-old PM samples at position 2.
To clarify the relationships between specific genera and driving factors, we analysed their correlation using Spearman’s coefficient (p<0.05). Organic acids significantly affected the core microbiota Lactobacillus, Caproiciproducens, Clostridium, Proteiniphilum, Prevotella, Bellilinea, Petrimonas, Methanobacterium, and Methanobrevibacter, while TN, AP, and moisture content affected a few families. Specifically, at position 1 (Additional file: Figure S2-A), lactic acid was positively correlated with levels of Caproiciproducens. In addition, butyric acid and acetic acid were positively correlated with Methanobacterium and Caproiciproducens, Caproic acid was negatively correlated with Petrimonas, and TN was negatively correlated with Prevotella and Lactobacillus. At position 2 (Additional file: Figure S2-B), lactic acid was negatively correlated with Clostridium yet positively correlated with Lactobacillus, butyric acid was negatively correlated with Petrimonas yet positively correlated with Prevotella, acetic acid was positively correlated with Lactobacillus and Caproiciproducens, CA was negatively correlated with Petrimonas yet positively correlated with Lactobacillus, and moisture content was negatively correlated with Prevotella. At position 3 (Additional file: Figure S2-C), lactic acid and AP were the two important factors influencing microbial community composition. For example, lactic acid was positively correlated with Methanobrevibacter, but AP was negatively correlated with Ralstonia, Bellilinea, and Proteiniphilum and negatively correlated with Clostridium. At position 4 (Additional file: Figure S2-D), CA and butyric acid were negatively correlated with Methanobrevibacter, while butyric acid and lactic acid were positively correlated with Methanobacterium.
In summary, physicochemical factors such as TN, AP, and organic acids played an important role in determining the microbiota composition, and microbes in different spatial locations were affected by different factors, because the microbial composition differed spatially.
Metabolomic analysis of PM
In the untargeted metabolomics analysis, we identified 5991 metabolites, of which 714 were identified by gas chromatography (GC-MS), and 5277 were identified by liquid chromatography (LC-MS) from PM samples collected over different years that are continuously used. After filtering and quality control (deletion of unknown compounds), 982 metabolites were left for further analysis.
Classification of total metabolites
We first classified the metabolites detected by GC-MS and LC-MS. The results showed that amino acids, peptides, and analogues, fatty acids, and conjugates were the main metabolites and accounted for approximately half of the metabolites (Additional file: Figure S3). Metabolites from PM also contribute to the liquor flavour. Fatty acids, amino acids, and peptides are decisive compounds in CSFL because they can produce flavour compounds as precursors in CSFL production.
Analysis of differential metabolites
We selected the metabolites according to the standard of P value <0.1 and VIP value> 1 to obtain the differential metabolites in different cell types.
In the position under the Huangshui fluid, there were 129 differential metabolites in the 30- and 100-year-old cellars (Additional file: Figure S4-A), of which 105 were up-regulated and 24 were down-regulated. There were 86 differential metabolites in the 30- and 300-year-old cellars (Additional file: Figure S4-B), of which 60 were up-regulated and 26 were down-regulated. Of 100 differential metabolites found in the 100- and 300-year-old cellars (Additional file: Figure S4-C), 41 were up-regulated and 59 were down-regulated.
In the position of the Huangshui fluid, 68 differential metabolites were found in the 30- and 100-year-old PM samples (Additional file: Figure S4-D), of which 29 were up-regulated and 39 were down-regulated, and of the 247 differential metabolites in the 30- and 300-year-old cellars (Additional file: Figure S4-E), 95 were up-regulated and 152 were down-regulated. There were 178 differential metabolites in the 100- and 300-year-old cellars (Additional file: Figure S4-F), of which 77 were up-regulated and 101 were down-regulated.
Analysis of differential metabolic pathways between different cellars
The main flavour compounds in CSFL are esters, alcohols, aldehydes, ketones, phenol, and pyrazine compounds. According to our results, differential metabolites were related to a variety of metabolic pathways. We selected differential metabolites for their potential participation in the substrate degradation and flavour development of PM. Dodecanoic, caprylic, myristic, arachidonic, γ-linolenic, and eicosapentaenoic acids were related to the biosynthesis of fatty acids and unsaturated fatty acid pathways, while threonine, asparagine, phenylalanine, isoleucine, aspartic acid, valine, proline, and phenylpyruvic acid were related to the biosynthesis of amino acids. Malic acid, succinic acid, and γ-aminobutyric acid were related to the butanoate and pyruvate metabolism pathways. The relative concentrations of the metabolites were also compared. Under the Huangshui fluid (Table S1), compared to the 30-year-old samples, the compounds γ-aminobutyric acid, threonine, proline, phenylalanine, dodecanoic acid, succinic acid, and valine were increased in the 100-year-old PM samples and decreased in the 300-year-old PM samples. The concentrations of arachidonic acid, eicosapentaenoic acid, caprylic acid, malic acid, myristic acid, isoleucine, glyceric acid, and β-hydroxypyruvate in the 30-year-old PM samples were higher than in the 100- and 300-year-old PM samples. The concentrations of phenylpyruvic acid and γ-linolenic acid in the 300-year-old samples were higher than in the 30- and 100-year-old PM samples. At the position of the Huangshui fluid (Table S2), compared to the 30-year-old samples, the compounds dodecanoic acid and α-linolenic acid were increased in the 100-year-old PM samples and decreased in the 300-year-old samples. The concentrations of homoserine, saccharopine, threonine, succinic acid, and aminoadipic acid in the 30-year-old PM samples were higher than in the 100- and 300-year-old PM samples. The concentrations of γ-linolenic acid, aspartic acid, caprylic acid, glucuronic acid, myristic acid, asparagine, and arachidonic acid in the 300-year-old PM samples were higher than in the 30- and 100-year-old PM samples. Figure 6A shows the reciprocal connections between the differential metabolites and KEGG pathways.
We further analysed the relationship between core microbiota and PM metabolites (Additional file: Figure S5). The correlation coefficients indicated a strong relation (0.6<|Spearman’s|<1). Seven of the nine main genera were involved in the distribution of certain metabolites. Positive correlations were observed between Caproiciproducens and isoleucine, gluconic lactone, and glyceric acid, Lactobacillus and CA, Methanobrevibacter and lactic acid, and Bellilinea and Proteiniphilum and caprylic acid.
Metagenomic analysis
Metagenomic microbial community composition
Among all sequences that passed the quality control, 62.15–64.1%, 21.81–27.92%, and 0.01% were identified as fragments originating from bacteria, archaea, and eukaryotes, respectively. The dominant bacterial phyla were Firmicutes (19.84–35.71%) and Bacteroidetes (8.11–19.57%). The archaeal reads were mainly affiliated with Euryarchaeota (21.75–27.8%). A small proportion of eukaryotic reads (less than 0.1%) were detected in the PM samples. This indicated that the PM microbiota was dominated by prokaryotes (Additional file: Figure S6).
Functional gene profiles of the PM metagenome
To investigate the metabolic potential of the PM microbiome, microbial genes aligned against the KEGG database among all PM samples were categorised into three levels of pathways based on metagenomic sequencing data. The functional gene profile at level 2 was mainly composed of carbohydrate metabolism, amino acid metabolism, energy metabolism, and translation metabolism in PM samples (Additional file: Figure S6-C).
We selected 135 enzymes from the KEGG database for their potential participation in the substrate degradation and flavour development of PM, and they were then grouped into 20 functional assemblies. Starch and sucrose metabolism are the primary sources of all compounds in the process of fermentation. According to starch and sucrose metabolism, the key enzymes alpha-amylase (EC 3.2.1.1) and glucan 1,4-alpha-maltohydrolase (EC 3.2.1.133) can catalyse starch hydrolysis to dextrin and maltose. The major producers of these two enzymes were Clostridiales (mainly Clostridium), Anaerolineales (mainly Bellilinea), Methanomicrobiales (mainly Methanoculleus), and Methanosarcinales (mainly Methanosarcina). Glucose is generally generated from starch during fermentation and is converted to pyruvate through the activities of aldose 1-epimerase (EC 5.1.3.3) and glucokinase (EC 2.7.1.2), while aldose 1-epimerase catalyses the interconversion of α- and β-terminal isomers of hexose, which are mainly produced by Petrimonas, Prevotella, and Clostridium. Moreover, glucokinase is related to the degradation of glucose and can catalyse glucose phosphorylation, which is mainly produced by Petrimonas, Clostridium, Prevotella, Methanoculleus, and Syntrophomonas. The above-mentioned microbiota genera were considered the major users of polysaccharides, and the generated glucose could be utilised by other taxa.
As for the flavour-generation metabolism pathways, CA is generally produced by anaerobic bacteria via the chain elongation pathway. The oxidation of ethanol can provide energy (acetyl-CoA) to sequentially elongate the carbon chain of carboxylic acids (acetic acid to n-butyric acid to CA). Petrimonas, Methanoculleus, Syntrophomonas, and Methanobacterium were the most prevalent genera that produced the enzymes (EC 2.3.1.9, 1.3.1.9, 2.3.1.16, and 4.2.1.55) involved in CA synthesis. Moreover, Lactobacillus, Clostridium, and Methanoculleus were the predominant genera that produced lactic acid from pyruvate, mainly catalysed by L-lactate dehydrogenase (EC 1.1.1.27) and D-lactate dehydrogenase (EC 1.1.1.28). In addition, lactate dehydrogenase (EC 1.1.2.3, EC 1.1.2.4), which can consume lactic acid to produce pyruvate, reduces the production of lactic acid. Conversely, Sedimentibacter and Clostridium were the main potential groups converting lactate back to pyruvate via D-lactate dehydrogenase (EC 1.1.2.4), while only a few species were annotated to produce lactate dehydrogenase (EC 1.1.2.3). Petrimonas was the predominant microbiota using pyruvate dehydrogenase (EC 1.2.4.1 and EC 2.3.1.12), which oxidised pyruvate to generate acetyl-CoA. This is a critical link in multiple metabolic pathways, contributing to aromas in CSFL, and can be oxidised to alcohol via acetyl-CoA synthetases (EC 6.2.1.13), aldehyde dehydrogenase (EC 1.2.1.3), and alcohol dehydrogenase (EC 1.1.1.1, EC 1.1.1.2). Among these, alcohol dehydrogenase is one of the most important enzymes involved in ester substance generation according to the alcohol–aldehyde–carboxylate–ester metabolism pathway. The gene coding for alcohol dehydrogenase was mainly enriched in microbes belonging to Clostridium. Butyrate is another basic aroma in CSFL and is produced from butyryl-CoA by the catalytic activities of acetate CoA transferase (EC 2.8.3.8) or butyrate kinase (EC 2.7.2.7). The genus source of butyrate kinase (EC 2.7.2.7) was Caloramator, Petrimonas, and Sedimentibacter, while the genus source of acetate CoA-transferase (EC 2.8.3.8) was Petrimonas, Sedimentibacter, and Proteiniphilum. Genes of various pathways to form acetate were detected in the microbiota, with acetyl-CoA and acetyl-P as intermediates. The key enzymes aldehyde dehydrogenase (EC 1.2.1.3), acetate kinase (EC 2.7.2.1), acetate-CoA ligase (EC 6.2.1.13), and pyruvate dehydrogenase (EC 1.2.5.1) involved in acetate metabolism were mainly produced by Clostridiales, Bacteroidales, and Anaerolineales, while acetyl-CoA synthetase (EC 6.2.1.1) was mainly produced by Methanobacteriales, Methanosarcinales, and Methanomicrobiales. The critical enzymes acetate CoA-transferase (EC 2.8.3.18) and acetyl-CoA synthetases (EC 6.2.1.13) helped acetyl-CoA change into acetate. Archaea were also the dominant microbes in the 30-, 100-, and 300-year-old PM samples, while cdhA (EC 1.2.7.4), acetyl-CoA synthetase (EC 6.2.1.1), and formate dehydrogenases (EC 1.17.1.9 and EC 1.17.1.10) were the most important enzymes related to methane metabolism. Acetyl-CoA synthetase (EC 6.2.1.1) catalyses the conversion of acetic acid to acetyl-CoA, and formate dehydrogenase (EC 1.17.1.9) catalyses the conversion of formate to CO2, while EC 1.17.1.10 catalyses the conversion of CO2 to formate.
Metabolic network related to substrate–flavour metabolism in PM microbiota
To better explain the link among the 20 functional assemblies summarised in Fig. 5, a specific metabolic network in PM microbiota among the nine orders and 30 microbial genera is presented schematically in Fig. 6, showing the pattern of enzyme-coding genes involved in the substrate–flavour metabolic pathway.