Water sampling and characterization
A total of 48 water samples were collected in Spring and Autumn from three main urban recreational waters in Changchun City. 8 samples from Nanhu Lake (NH), 8 samples from Jingyue Lake (JY), 8 samples from Beihu Lake (BH) were obtained in the Autumn of 2018 and Spring of 2019, respectively. The sampling sites in Spring and Autumn were the same. The specific longitude and latitude coordinates were shown in (Table S1). Water samples were collected at 0.2 m below the surface of the lakeshore with a fixed-depth water sampler. A total of 5 L water sample was collected from each site. The samples were transported to the laboratory on ice within 8 h. 1 L sample was saved under 4 °C temporally for survival study. 2 L water sample was filtered through 0.22 μm membrane, and the samples were saved in -80 °C fridge for community DNA extraction. Water physicochemical properties were determined with well-developed methods (see below). TN (total soluble nitrogen) was qualified by potassium persulfate oxidation-double wavelength spectrophotometry method. NH4+-N (ammonia nitrogen) was qualified by Nessler reagent spectrophotometry. TP (total phosphorus) was measured by molybdenum-antimony-ascorbic acid spectrophotometric. TOC (total organic carbon) was measured by a total organic carbon analyzer (TOC-VCPH, Shimazu, Japan). The EC (electrical conductivity) was determined by s conductivity meter (DDS-11A, Rex, China). The pH was determined by a pH meter (FE20, Mettler, China).
Bacterial Strains
The strain used in this experiment was E. coli O157:H7 EDL931 (ATCC 35150), conferring toxic genes including stx1, stx2, and eae. The EDL931 was originally obtained from human feces (Beery et al., 1984). The EDL931 wild type was tagged with rifampicin resistance for ease of counting (Ma et al. 2011).
Survival experiment of E. coli O157:H7
The EcO157 cells were initially inoculated in LB liquid medium and incubated at 37 °C with shaking (220 r·min-1) for 15 h. The cells were harvested by centrifugation at 4 ℃, 18,000 g for 10 min. Then the cell pellet was washed three times by 0.9% sodium chloride buffer, and finally resuspended in sterile deionized water. The cells were then starved for 2 h in dark under 4 °C. The starved cells were then inoculated into lake water samples, and the concentration of EcO157 solution was about 1.0×108 CFU/mL. The inoculated water samples were sampled periodically, and subjected to 10-fold serial diluted. Fifty microliters of the two highest dilutions were plated in duplicate onto the SMAC (sorbitol MacConkey) agar (Lab M, Lancashire, UK) containing 100 mg/L of rifampicin (Ma et al. 2011). The plates were incubated at 37 °C for EcO157 counting. The detailed process of count could be found in our previous publications (Han et al., 2021; Ma et al., 2014).
Survival data modeling
The modeling EcO157 survival was achieved by fitting the experimental data to the Weibull model (Mafart et al., 2002) using GInaFiT Excel add-in (Geeraerd et al. 2005). The model was established based on the assumption that the survival of EcO157 follows the Weibull distribution. The number of the survivors can be quantified by using the following equation:

where Nt, N0, andt represent survivor counts, inoculum size, and inoculation time, respectively. The δ is scale parameter representing the time needed for the first decimal reduction, p is the shape parameter. Survival curves display a convex, concave, and linear shape when p > 1, < 1, and = 1, respectively. The survival time (ttd) represents time (days) needed to reach the detection limit could also be obtained by fitting the Weibull model.
Water DNA extraction, sequencing, and sequencing data processing
Water community DNA was extracted from prepared fiber membranes from each of the 48 water samples. The V3 and V4 regions of 16S rDNA were amplified using forward primers containing the sequence “5’-CCTACGGRRBGCASCAGKVRVGAAT” and reverse primers containing the sequence “5’-GGACTACNVGGGTWTCTAATCC”. PCR products were detected by 1.5% agarose gel electrophoresis. DNA libraries concentration was validated by Qubit 3.0 Fluorometer. DNA libraries were multiplexed and loaded on an Illumina MiSeq instrument according to the instructions (Illumina, San Diego, CA, USA) (Caporaso et al., 2012). PE250/300 paired-end was used to sequence. Image analysis and base calling were conducted by the MiSeq Control Software (MCS) embedded in the MiSeq instrument. The purified chimeric sequences were used for OTU clustering by VSEARCH clustering (1.9.6) (sequence similarity is set to 97%). The used 16s rRNA reference database was Silva, 132. Bayesian algorithm was used to analyze representative sequences and classify community composition under different species classification level. The sequencing data have been deposited with links to NCBI under accession number PRJNA725369.
Statistical analysis
The bar charts of survival parameters, physicochemical properties and relative abundances of major bacterial phyla were plotted by OriginPro 9.1 (OriginLab, USA). Principal component analysis (PCA) was constructed to visualize the differences of physicochemical properties in different samples with the vegan package of R 4.0.4. Principal coordinate analysis (PCoA) was performed to appear the differences of bacterial community in Bray-Curtis conversion by vegan package. Multivariate analysis of variance (MANOVA) was conducted with SPSS 26 (IBM, USA) in order to test the effects of lake, season and interaction (Lake×Season) on survival parameters of EcO157 and water physicochemical properties. Dissimilarity analysis was applied to indigenous bacterial community by Multi-Response Permutation Procedure (mrpp), Permutational MANOVA (adonis) and Analysis of Similarities (anosim) based on four distance conversion measures of Bray-Curtis, Horn, Gower and Jaccard (vegan package of R 4.0.4).
Mantel and partial Mantel tests (non-parametric statistical methods) were calculated with the vegan package of R 4.0.4. The Mantel test is a correlation test between two matrices including measured variables (Guillot and Rousset, 2013). Partial mantel test also measures correlation, but controls for the effect of a third matrix including other variables (Guillot and Rousset, 2013). The influence of bacterial community and physicochemical properties on the survival of EcO157 was tested by Mantel and partial Mantel. In addition, Pearson correlation between ttd and relative abundances of phylum bacterial taxa in Spring and Autumn samples was calculated respectively.
To clearly display the correlations between survival parameters, physicochemical properties and bacterial community, co-occurrence network analysis based on Pearson correlation (p < 0.05) was conducted. In order to simplify the network, the correlations between bacteria were deliberately removed. Co-occurrence network analysis and plotting were jointly completed through the Hmisc package of R 4.0.4 and Gephi 0.9.2(Bastian et al. 2009).
In order to further reveal the influence of environmental factors, variation partition analysis (VPA) was used to quantify the explanatory contribution of partitioning factor matrices to the overall ttd variation of EcO157. VPA was performed after obtaining the corrected R2 (total explanatory proportion) of Redundancy analysis (RDA) with R 4.0.4 (Dixon, 2003) using vegan package. VPA can be output in the style of Venn diagram. The overlap in the Venn diagram indicated combined contributions of multiple matrices (Peres-Neto et al., 2006).
The structural equation model (SEM) can more accurately identify the direct and indirect effects of environmental factors on the persistence of EcO157. SEM was performed by AMOS 22.0 (Amos Development, Spring House, PA, USA). Direct effect, indirect effect and total effect were calculated based on standard path coefficients (λ) which reflect the strength of the relationship between variables (Igolkina and Samsonova, 2018). Model fitness is the degree of consistency between the hypothetical theoretical model and the actual data (Wyeld and Nakayama, 2019). The model fitness parameters and criteria are as follows: χ2 > 0.05, chi-square; p > 0.05, p value; CMIN/DF < 3, ratio of χ2 and degrees of freedom; GFI > 0.9, goodness of fit index; RMSEA < 0.08, root mean square error of approximation.