Study site and soil sampling
This study was conducted in an inland semiarid saline ecosystem around the largest salt lake, Huamachi lake, in Shaanxi, China. The Huamachi lake is located in Dingbian of Shaanxi province (37º68′N, 107º53′E). The study site has a temperate continental semi-arid monsoon climate, with a mean annual temperature of 7.9°C, mean annual precipitation of 312 mm, and mean evaporation demand of 2523 mm. Precipitation primarily occurs from July to September. The soil in the dry lakebed and the surrounding region is a saline soil, with a texture of sandy loam, an EC of 6.72–12.08 mS cm− 1 and a salinity of 2.79–5.79%. The covering vegetation is dominated by Phragmites australis, Calamagrostis epigeios, Kalidium foliatum, Suaeda salsa, Nitraria sibirica and Tamarix chinensis, with a canopy cover of > 60%.
In early July 2020, we established nine adjacent plots (10 m × 10 m) for the sampling of rhizosphere and bulk soils in the surrounding region of Huamachi lake. In each plot, we randomly selected six plants for each of the six halophytes (P. australis, C. epigeios, K. foliatum, S. salsa, N. sibirica and T. chinensis) for the field sampling of rhizosphere samples. The roots of each plant from 0–15 cm depth were carefully taken out, lightly shaken to remove loosely bound soils, then the soils that tightly attached to the roots (rhizosphere soils) were brushed off to compose the rhizosphere sample of each halophyte for the plot. We also collected 6 soil samples at 0–15 cm depth from bare land without any plant using a sterilized soil auger to combine as a composite bulk soil sample for the plot. Totally, we have 63 samples (six rhizosphere samples and one bulk soil sample for each of the nine plots). All the moisture rhizosphere and bulk soil samples were stored in ice box, transported to local laboratory and then divided into three sub-samples: one air-dried for the measurement of soil physiochemical properties, one stored at -20°C for the measurement of extracellular enzyme activities, and one stored at -80°C for high-throughput sequencing.
Laboratory analysis for soil properties and extracellular enzyme activities
A small fraction of moisture soils was dried at 105°C to measure soil moisture content. Another fraction of moisture soil was used for the measurement of available nitrogen (NH4+ and NO3−). Air-dried soils were ground to pass through a 2-mm sieve for the measurement of available phosphorous (OP), soil pH, EC and salinity. A small fraction of < 2-mm samples were ground to pass through a 0.25-mm sieve for the measurement of SOC and TN. Soil metrics were measured using standard methods as described by Qiu et al. . The SOC and TN were measured using the Walkley-Black and Kjeldahl method. Soil NH4+ and NO3− were measured using a continuous flow analyzer (AutoAnalyzer-AA3, Seal Analytical, Norderstedt, Germany) after extraction with 2 mol L− 1 KCl. Soil OP was determined by the Olsen method. Soil pH and EC were measured in a soil: water (1:5) extract with a pH meter (Mettler Toledo, Germany). Soil salinity was measured by extracting soil samples with deionized water in a soil to solution ratio of 1:5 and drying the extracts to constant weight.
The activities of extracellular enzymes involved in carbon, nitrogen and phosphorous cycles were measured using Microplate-scale fluorometric method . The carbon-cycling related enzymes are β-1,4-glucosidase (BG), 1,4-β-D-cellobiohydrolase (CBH) and β-xylosidase (BX), the nitrogen-cycling related enzyme is β-1,4-N-acetyl-glucosaminidase (NAG), the phosphorous-cycling related enzyme is acid phosphatase (AP). Briefly, 3 g moisture soil sample was combined with 125 mL Tris buffer (50 mM Tris) in a crystal dish and homogenized to a uniform suspension using a blender for approximately 1 min. The pH value of the buffer was adjusted to be similar to that of soil sample with HCl or NaOH. The soil suspension (150 µl) and test enzyme substrate (50 µl) were added to the 96 microwell enzyme plate. We set up the blank control (150 µl soil suspension + 50 µl buffer), negative control (150 µl buffer + 50 µl enzyme substrate) and quench standard (150 µl buffer + 50 µl standard solution) for each sample. The microtiter plates were incubated at 25ºC in the dark, AP for 0.5 h, BG for 2 h, CBH, BX and NAG for 4 h. The fluorescence signals for the enzymes were obtained at 365 nm excitation and 450 nm emission (BioTek, Synergy TM LX, USA). The average of the z-score of each extracellular enzyme activity was calculated to indicate soil microbial activity [53, 58].
The content of total amino sugar (TAS) was measured to indicate microbial residue in soils . Briefly, the soil samples were hydrolyzed with 10 mL of 6 M HCl at 105℃ for 8 h containing myo-inositol. The hydrolysate was filtered after adding myo-inositol, adjusted to pH 6.6–6.8 and centrifuged. The supernatant was freeze-dried and the amino sugars in residues were extracted with methanol. The recovered amino sugars were first cyanated with hydroxylamine hydrochloride and 4-dimethylaminopyridine, and then acetylated with acetic anhydride to form aldononitrile derivatives. The AS derivatives were analyzed via a Beifen 3420A gas chromatograph (Beijing Beifen- Ruili Analytical Instrument. Co. Ltd., Beijing, China) equipped with a DB-5 column (50 m × 0.25 mm × 0.25 µm) and a flame ionization detector. The detailed temperature program was set by Zhang and Amelung . Methylglucamine was added as the recovery standard before derivation. Each individual amino sugar was quantified, and then summed to calculate total amino sugars content.
DNA extraction, PCR and sequencing
Soil total genomic DNA of each sample was extracted using the MP FastDNA spin kit for soil according to the manufacturer instructions (MP Biomedicals, Solon, OH, USA). The 16S rRNA gene was amplified by polymerase chain reaction (PCR) using the targeting primer pairs 341F (5′-CCTAYGGGRBGCASCAG-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3). The ITS1 rDNA gene was amplified by PCR using the targeting primer pairs ITS5-1737F (GGAAGTAAAAGTCGTAACAAGG) and ITS2-2043R (GCTGCGTTCTTCATCGATGC). The PCRs were carried out with 30 µL reactions mixture containing 6 µM primers, 10 ng template DNA, 15µL Phusion® High-Fidelity PCR Master Mix 2× (New England BioLabs) and 2µL H2O. The samples were amplified by the following conditions: initial denaturation at 98°C for 1 min, followed by 30 cycles of denaturation at 98°C for 10 s, annealing at 50°C for 30 s, extension at 72°C for 30 s and final extension at 72°C for 5 min. All samples were amplified in triplicate and pooled together in equal amounts, and then subjected to electrophoresis detection in a 2% (w/v) agarose gel. Then PCR products were purified with GeneJET Gel Extraction Kit (Thermo Scientific). The sequencing libraries with sample-specific index tags were constructed using TruSeq® DNA PCR-Free Sample Preparation Kit (Illumina, USA). The library quality was assessed by the Qubit® 2.0 Fluorometer (Thermo Scientific) and Agilent Bioanalyzer 2100 system. Sequencing was performed on the Illumina NovaSeq platform to generate 250-bp paired-end reads at the Novogene Company, Beijing, China.
Raw paired-end reads were assigned to samples based on the unique barcode and truncated by cutting off the barcode and primer sequence, and then merged using FLASH (V1.2.7) with default parameters . Then, sequences with more than three consecutive low-quality base calls (Phred quality score smaller than 20) were truncated, and only reads with > 75% consecutive high-quality base calls and no ambiguous characters were retained for further analyses based on QIIME v1.9.1 . Chimeric sequences were removed using a combination of de novo and reference-based chimera checking based on UCHIME algorithm . Filtered sequences were clustered into operational taxonomic units (OTUs) using Uparse v7.0.1001  based on 97% similarity threshold. For 16S sequencing, the Mothur algorithm  and SILVA_v138 database (http://www.arb-silva.de/) were used for taxonomy assignment. For the ITS sequencing, representative sequence annotations were carried out by BLAST algorithm in the assign_taxonomy.py script based on UNITE v8.2 database (https://unite.ut.ee/). Multiple sequence alignments were conducted using the MUSCLE v 3.8.31 . We excluded chloroplast, mitochondria and rare OTUs (relative abundance < 0.005%) for bacteria and archaea, and filtered nonfungal and rare OTUs (relative abundance < 0.005%) for fungal dataset, respectively . As a result, a total of 1,562,706 high-quality sequences (range 15,751 − 34,824; media 24,388 sequences per sample) and 3,638,843 high-quality sequences (range 36,914 − 67,776; media 58,007 sequences per sample) were remained. Sequencing depths were rarefied to 15,751 and 36,914 sequences for bacteria/archaea and fungi, respectively. We also verified that the removal of rare OTUs did not affect our results based on the correlation analysis for alpha diversity (R2: 0.76–0.99, p < 0.0001; Fig. S11), Mantel tests (r: 0.9974–0.9995, p < 0.0001) and Procrustes analyses (M2: 0.0002–0.0027, p < 0.0001) in vegan package (Fig. S12) . In addition, we constructed OTUs rarefaction curves to evaluate richness saturation using ggrare function in ranacapa package (Fig. S13) . We used the Functional Annotation of Prokaryotic Taxa (FAPROTAX) , and FUNGuild  to extrapolate functional groups of bacteria and fungi in the soil, by using FAPROTAX_v1.2.1 and FUNGuild_v1.1 python script with default settings. We used only taxa that were rated as either “probable” or “highly probable” according to the confidence ranking for the guild assignment in order to avoid over-interpreting the fungal functional groups. The sequencing data were deposited into the NCBI (https://www.ncbi.nlm.nih.gov/) with bioprojects accession number PRJNA738372.
All statistical analyses were conducted in the R environment (v3.6.3, http://www.r-project.org/). To assess the microbial alpha diversity, community richness (Observed OTUs, Chao1 and ACE) and diversity (Shannon and Faith’s phylogeny diversity) indices of bacteria and fungi were calculated based on the rarefied data. Differences in soil properties, extracellular enzyme activities, microbial alpha diversity and dominant phyla among halophyte rhizospheres and bulk soils (habitat) were determined by linear mixed effect models, including habitat as a fixed effect, and plot as a random effect using nlme package . The assumptions of homoscedasticity and normality of residuals were inspected by Shapiro-Wilk test and residual versus fitted plots. Multiple comparisons were assessed by Tukey’s Honestly Significant Difference (Tukey’s HSD) in multcomp package  or Dunn's test in dunn.test package  when the assumptions were not met. Venn diagrams were conducted by using the venn function in venn package  to show the numbers of shared and unique observed OTUs among halophyte rhizospheres and bulk soils.
Constrained analysis of principal coordinates (CAP) based on Bary-Curtis distance between each pair of samples were performed to test the host effects of halophytes on soil bacterial and fungal community structures. Statistical significance of the CAP was measured by the permutest function with 999 permutations in the vegan package . To evaluate the significance of the host plant effects on soil bacterial and fungal community composition and heterogeneity, we performed permutational multivariate analysis of variance (PERMANOVA) and permutational analysis of multivariate dispersions (PERMDISP) on Bray-Curtis distance by using adonis and betadisper functions in the vegan package , respectively. The separations of mean values (distance to centroid) among habitats were evaluated by one-way ANOVA with Tukey’s HSD tests. All ordination analyses were performed using the phyloseq package . We also calculated the pairwise dissimilarity distances between each halophyte rhizosphere and bulk soil bacterial or fungal community based on Bray-Curtis distance metric to explore the differences in bacterial or fungal community structures across halophyte rhizospheres and bulk soils.
Indicator species analyses were performed to identify bacterial and fungal taxa significantly associated with a given habitats on OTUs level using the multipatt function in the indicspecies package . The analyses were permutated 999 times, and the significance of association between the taxa and the habitat was evaluated at a false discovery rate (FDR) corrected p value of < 0.05 . Bipartite networks were implemented to visualize the identified indicator taxa using Gephi software (https://gephi.org). To better characterize the host effects of halophytes on rhizosphere microbial composition, we retrieved the dominant taxa (top 50 abundant for bacterial and fungal OTUs, respectively) and identified whether they belong to the indicator taxa. Phylogenetic distributions and relative abundance of the dominant taxa were constructed using ggtree function in ggtree package  and using ggplot function in ggplot2 package .
The microbial co-occurrence networks were constructed based on 16S and ITS gene sequencing data to explore the bacterial-fungal co-occurrence relationships. Microbial co-occurrence networks were constructed using Molecular Ecological Network Analyses Pipeline (MENAP, http://ieg4.rccc.ou.edu/MENA/), using random matrix theory (RMT) to determine the correlation cut-off threshold automatically [97, 98]. The RMT theory was applied to define the transition point of nearest-neighbor spacing distribution of eigenvalues from Gaussian orthogonal ensemble (random) to Poisson (non-random) distribution, and this transition point was then used as the correlation cut-off for network construction [97, 98]. Briefly, only OTUs detected in at least eight replicates (out of nine replicates for each habitat) were used for each network construction. OTUs with missing values remained blank, and the pairwise correlation of the two pairs of OTUs were calculated based on relative abundance data and Pearson correlation coefficients. Then, the optimal threshold (0.92–0.94) recommended by RMT theory was used to filter the adjacency matrix (only correlations above the optimal threshold were used for defining the adjacency matrix) and construct co-occurrence networks.
Network topological indices were calculated in the MENAP interface, including node, edge, R2 of power-law distribution, average path distance (geodesic distance), harmonic geodesic distance, modularity and numbers of module. The smaller average path distance and harmonic geodesic distance indicate that the co-occurring network is close . In addition, we calculated the edge betweenness centrality by edge.betweenness function and counted numbers of bacterial node, fungal node, positive edge, negative edge, bacteria-bacteria edge, fungi-fungi edge, and bacteria-fungi edge in each network in igraph package . To validate the nonrandom co-occurrence patterns, 100 random networks were constructed for each empirical network in an equal size (constraining numbers of node and edge), following the Maslov-Sneppen procedure . The mean values and standard deviations of topology properties from 100 random network were calculated and compared with the corresponding empirical network. Topological roles of nodes in networks were determined by within-module connectivity (Zi) and among-module connectivity (Pi) and classified into four categories: peripheral nodes (Zi < 2.5, Pi < 0.62), connectors (Zi < 2.5, Pi > 0.62), module hubs (Zi > 2.5, Pi < 0.62), and network hubs (Zi > 2.5, Pi > 0.62). Module hubs are highly connected to many nodes in their own modules, while network hubs act as both module hubs and connectors. Network hubs, module hubs and connectors are regarded as keystone nodes [97, 98, 101]. All networks were visualized in R package igraph .
Cohesion, as a measure of the connectivity of microbial communities, was the sum of abundance-weighted and null model corrected index based on pairwise correlations across taxa [102, 103]. Cohesion can be used to identify associations among taxa caused by the interaction of positive and negative species and/or by the similarities and differences in the niches of microbial taxa [65, 102]. In order to evaluate the differences in cohesion values among different networks and to be consistent with the network analysis, we focused on those taxa that were present in the network [65, 104], and then calculating two cohesion values (positive and negative) for each sample:
where n is the total number of taxa in a given community, and the construction process is mainly divided into two steps, including the calculation of the positive and negative connectedness matrix of all taxa in each sample and the final calculation of cohesion. For example, we first subtracted the pairwise associations matrix of all taxa within a given community (observed correlations) from the pairwise associations matrix of the null model iterations (expected correlations) to obtain pairwise correlations matrix corrected by null model of each taxa. Then, the positive and negative connectedness matrix of each sample was generated by averaging the positive and negative null model-corrected correlations separately. Finally, the positive and negative cohesion value of each sample was obtained by multiplying relative abundance table by the connectedness metrics according to the above formula . The ranges of negative and positive cohesion were − 1 to 0 and 0 to 1, respectively. The more negative cohesion value indicates more complex microbial networks [65, 102]. Additionally, the absolute ratio of negative to positive cohesion were calculated to predict the co-occurrence network complexity, higher ratio indicates more complex microbial network [65, 102, 104, 105].
Variance partitioning analysis were conducted to evaluate the relative effects of types of host species, soil properties and electric conductivity on the variation in soil bacterial and fungal community composition, using the varpart function in the vegan package . Correlation analysis of microbial diversity and soil properties was conducted by using Pearson correlation coefficient with FDR corrected p value < 0.05 . Heatmaps of correlation coefficients were produced using the ggplot2 package .
We constructed a piecewise structural equation model to examine the causal pathways through which changes in soil nutrients and EC influence on microbial activity (multiple extracellular enzyme activity index) and total microbial residue (total amino sugar content), where both direct and indirect (via changing various aspects of community composition and network complexity) pathways were considered. Because SOC was positively correlated with soil nitrogen and phosphorous, we only included SOC in the model to indicate soil nutrients. The SOC was log-transformed and EC was sqrt-transformed. First, Akaike information criterion (AIC) was used to compare the linear mixed model with plot as random effect and the general linear model, and the model with the smaller AIC was selected to construct piecewise SEM. We use Shipley’s test of d-separation to test whether any paths are missing from the model and remove gradually the least significant paths from the models to improve model fit [106, 107]. We reported the standardized coefficient for each path from each component model. The final model was selected among the non-significant test, with higher Fishers-C (if P > 0.05, then no paths are missing and the model is a good fit)  and the lowest AIC. The SEMs were fitted by using the piecewiseSEM package .