Study area and sample collection
Kuoqionggangri glaciers (KQGR, 29°51′48″N, 90°12′39″E) are located in southern regions of the Tibetan Plateau (Figure S1). As one of the largest glaciers in Nyainqen Tangglha range, KQGR are mainly controlled by the South Asian monsoon (Zhang et al., 2021c). The annual mean air temperature in this region is -1.28℃ and the annual mean precipitation is ~ 550 mm, with approximately 90% of the precipitation occurring in the summer season from June to September (Gao et al., 2021a). KQGR consist of seven small glaciers, and at least nine glacier lakes are expanding as the glacier calving. Among them, seven lakes are originated from a rapidly retreating KQGR Ⅰ glacier (Figure S1). Of the seven lakes, three lakes are in close proximity to the glacier but are located in different directions of the glacier. The fourth lake is connected to lake 3, which lake 5 is connected to the glacier by a stream. Lake 6 and lake 7 are situated behind a ridge, and has lost connect to the glacier (Figure S1).
In this study, we collected surface snow (0-15cm) on KQGR Ⅰ glacier, paired water and surface sediment samples were collected synchronously from seven nearby pro-glacier lakes on 11 July 2020 (Figure S1). In each lake, at least three replicate water and sediment samples were collected. In total, 16 snow samples, 24 water and 24 sediment samples. For snow sampling, we used a sterilized Teflon shovel to transfer surface snow into sterile 700 ml Whirl-Pak bags. The samples were stored in cooling containers and transported to the laboratory and slowly thawed over 24–48 h. After melting, approximately 500 ml snowmelt water was filtered onto a 0.22 µm polycarbonate membrane (47 mm, Millipore, USA) for DNA extraction. For water sampling, 1 L of water was collected at depths of 10–20 cm with a 1000 mL Ruttner water sampler (Hydro-Bios, Altenholz, Germany) (Liu et al., 2021) into a sterile plastic bottle. In the laboratory, approximately 500 ml water first filtered through 20 µm mesh (Millipore, USA) to remove large particles, and then filtered through a 0.22 µm polycarbonate membrane. Another portion was used for physicochemical analysis. Surface sediment samples were collected with stainless-steel shovel and placed into the sterile Whirlpak bags. Subsequently, samples were brought to the laboratory and stored at -80 ℃ until further analysis.
Estimation of glacial influence
To assess the influence of glaciers on bacterioplankton biodiversity, glacier index (GI) was calculated according to (Jacobsen and Dangles, 2012):
$$\text{G}\text{I}=\frac{\sqrt{\text{S}\text{i}\text{z}\text{e}}}{\text{D}\text{i}\text{s}\text{t}+\sqrt{\text{S}\text{i}\text{z}\text{e}}}$$
where GI is the glacier index, Size is the glacier size (km2), and Dist is the distance of the sampling site from the glacier terminus (km). GI value of 1 indicates that the distance between the sampling site and the glacier terminus is zero, where the glacier has the greatest influence, while GI value of zero means that the glacier has the least influence.
Physicochemical analysis
Sampling site coordinates and altitude were collected using a Global Positioning System (GPS). Lake area and straight-line distances from sampling site to glacier terminus were calculated from remote sensing image based on GPS coordinates (Kohler et al., 2022).
The water temperature (Temp), pH, conductivity (Con), total dissolved solids (TDS) were measured in the field using a multiprobe Water Quality Sonde (Hydrolab DS5, Hach, USA). In the laboratory, water samples were filtered through 0.45 µm hydrophilic polyethersulfone syringe filter (25 mm, Anpel) to determine the dissolved organic carbon (DOC), total nitrogen (TN), nitrate (NO3−) and sulfate (SO42−). DOC and TN were measured by a TOC-Lcph (Shimadzu Corp., Japan) following the standard methods (APHA 2005). The NO3− and SO42− concentration was determined with Dionex ion chromatography (IC) system 600 (Thermo Fisher Scientific Instrument Co., LTD). Sediment pH and conductivity was determined in a 1:5 sediment/water suspension using a PHS-3C pH and DDS-307A conductivity meter, respectively. The sediment total carbon (TC) was determined using a TOC-Lcph analyzer. The sediment NH4+-N and NO3− -N were extracted using a ratio of 5 g sediment to 25 mL 2 mol/liter KCl. After shaking for 1 h, the extracts were filtered using a filter paper and the concentrations were measured using a continuous flow Injection analyzer (Smartchem200) (Hou et al., 2019). Total nitrogen (TN) and total phosphorus (TP) contents was determined by digestion H2SO4 and H2SO4-HClO4, respectively, and ultimately determined via flow analyzer (Smartchem200). Details about the samples and physicochemical measurements are in the table S1.
DNA extraction and sequencing
The total bacterial DNA was extracted from the filter membrane or 0.5 g sediment samples using the MoBio Power Soil DNA isolation kit (MoBio Laboratories, Carlsbad, CA, USA) according to the manufacturer’s instructions. The extracted DNA was resuspended into 100 µL of nuclease-free water and quantified using a Qubit 4.0 fluorometer with the dsDNA Broad Range assay kit (Invitrogen, Singapore).
The V4-V5 region of the bacterial 16S ribosomal RNA (rRNA) gene was ampli-fied using primer set 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 907R (5′-CCGTCAATTCMTTTRAGTTT-3′) (Chen et al., 2020a). The 50 µL polymerase chain reaction (PCR) reaction systems were prepared in triplicate and each contained 10 ng DNA template, 5 µL 2x Premix Taq DNA polymerase (Takara Biotechnology, Dalian, China), 1 µL of each primer (10 µM) and 20 µL of nuclease-free water. The PCR reaction was performed under the condition: 94℃ for 3 min, 30 cycles of 94℃ for 30 s, 60℃ for 30 s, 72℃ for 1 min and a final cycle of 5 min at 72℃. Afterwards, PCR products were purified with Agencourt AMpure XP beads, pooled at equal volume, and sequenced on an Illumina MiSeq machine (Illumina, San Diego, CA) with a paired-end strategy. All raw sequence obtained in this study have been deposited into the National Center for Biotechnology Information (NCBI) Sequence Reads Archive (SRA) database under the BioProject number PRJNA873432.
Sequence analysis
Raw reads of 16S rRNA gene amplicons were quality-controlled and analyzed using QIIME2 (2020.04) pipeline. Basic criteria were respected following the official suggestions. DADA2 was used for the detection and correction of amplicon sequence data, as well as the generation of the phylogenetic tree. Taxonomic assignment of sequences was assigned using the Naive Bayes classifier trained by the Silva 138 16S rRNA database (Quast et al., 2012). With the pipeline provided by QIIME2, Amplicon Sequence Variants (ASVs) were produced in place of Operational Taxonomic Units (OTUs) for a higher resolution of taxonomic annotations (Glassman and Martiny, 2018). ASVs that were annotated as chloroplasts, archaeal and unassigned sequences (not affiliated with Bacteria) were excluded from subsequent analysis. Overall, we obtained a total of 1605568 high-quality sequences from 64 samples, which were clustered into 10820 ASVs at the cutoff of 97% sequence identity.
Estimation of the stochastic ratio in community assembly
The community assembly processes of water and sediment bacterial communities were evaluated by using neutral and null models. The neutral community model (NCM) was applied to evaluate the potential importance of stochastic processes on community assembly (Sloan et al., 2006). The null model-based stochastic ratio was calculated using the modified method with the Bray-Curtis distance to determine the relative importance of stochastic and deterministic processes in shaping community assembly (Guo et al., 2018; Liang et al., 2020). The normalized stochasticity ratio (NST) is performed in R using the “NST” package to reflect the contribution of stochastic processes based on relative differences between the observed situation and the null expectation (Ning et al., 2019). The NST index defines 0.5 as the boundary point to determine whether the community assembly is more deterministic (< 0.5) or more stochastic (> 0.5). Levins’ niche breadth index was calculated for both water and sediment bacterial communities using the “niche.width” function within R package “spaa” (Zhang and Zhang, 2013). In general, a bacterial community with a wide niche breadth is expected to be more metabolically flexible at the community level than one with a narrow niche breadth.
Network analysis
To estimate how species coexistence changes in different habitats as glaciers retreat, we first divided the lakes into two groups based on GI values: lakes close to the glacier terminus with high GI values (HGL, lake 1 to lake 3), and lakes away from the glacier terminus with low GI values (LGL, lake 4 to lake 7). Then, constructed separate correlation networks for 24 water and 24 sediment samples. Finally, subnetworks of HGL and LGL were extracted from the water and sediment network using the induced_subgraph function in “igraph” package in R (Csardi and Nepusz, 2006) and analyzed individually. All Robust correlations with Spearman’s correlation coefficients (ρ) > 0.8 and false discovery rate [FDR]-adjusted P -value < 0.01 were used to construct networks. Network visualization was generated with Gephi version 0.9.2 (Bastian et al., 2009). Each node represents one ASVs, and each edge represents a strong and significant correlation between two nodes. For each network, we calculated a set of topological features: average path length, network diameter, average degree, clustering coefficient, and graph density. Higher average degree, clustering coefficient, and graph density suggest a more connected network. In addition, lower average path lengths and diameters indicate closer associations in the network (Barberán et al., 2012; Ma et al., 2016). Furthermore, the 1000 Erdös-Réyni random networks, which exhibit the same number of nodes and edges as the real networks, were calculated in the “igraph” R package.
Community cohesion was calculated to quantify the degree of connectivity of the water and sediment bacterial communities (Herren and McMahon, 2017). In this analysis, positive and negative cohesions were calculated for each sample as the sum of the significant positive or negative correlations between taxa weighted by OTU abundances. We use the absolute value of negative:positive cohesion to test whether changes in network stability are caused by changes in GI values.
Statistical analyses
The Bayesian-based SourceTracker method was performed to quantify the extent of contribution of potential source to the sink (Knights et al., 2011). In the present study, SourceTracker analysis were performed twice: (i) lake water samples were defined as sink, and the snow samples as potential source, and (ii) both lake water and snow samples were considered as potential sources and sediment samples as sinks. Spearman correlation analysis was used to determine the bacterial genera in water and sediment samples that were significantly correlated with the distance from the glacier. Bacterial alpha diversity indices, including the species richness and Shannon index were calculated for each sample in the R package “vegan”. The Bray–Curtis dissimilarities of bacterial communities between water and sediment were examined using Wilcoxon rank-sum tests, performed using the wilcox.test function in “stats” package in R (Field et al., 2012). Water and sediment bacterial community compositions were visualized using non-metric multidimensional scaling (NMDS) based on Bray-Curtis dissimilarities. Vegan’s functions metaMDS was used for ordinations, envfit was used to fit environmental vectors and ordisurf, which uses generalized additive models, was used to visualize the sampling site distance from the glacier in the ordinations (Oksanen et al., 2013). Analysis of similarity (ANOSIM) was used to evaluate differences in bacterial communities between habitats. All statistical analyses were performed in R3.6.1 (http://www.r-project.org). The distance-based multivariate linear model (DistLM) based on the community Bray-Curtis dissimilarities was performed in the DISTLM_forward3 program to determine the combination of environmental variables that significantly explain the observed dissimilarity among the samples (McArdle and Anderson, 2001).