The rhizosphere microbiota of the zinc and cadmium hyperaccumulators Arabidopsis halleri and Noccaea caerulescens is highly convergent in Prayon (Belgium)

The Prayon site is known as a zinc-polluted area where two zinc and cadmium hyperaccumulator plant species currently coexist, although Arabidopsis halleri was introduced more recently than Noccaea caerulescens . While soil microorganisms may in�uence metal uptake, the microbial community present in the rhizosphere of hyperaccumulators remains poorly known. Plants of both species were sampled with their bulk and rhizosphere soil from different plots of the Prayon site. Soil components (ionome, pH, water composition, temperature) were analyzed, as well as shoot ionome and expression levels of metal transporter genes ( HMA3 , HMA4 , ZIP4 / ZNT1 , ZIP6 , MTP1 ). The taxonomic diversity of the microorganisms in soil samples was then determined by 16S rRNA metabarcoding and compared at the Operational Taxonomy Unit (OTU) level and across different taxonomic levels. Our elemental analyses con�rmed that the site is still highly contaminated with zinc and cadmium and that both plant species indeed hyperaccumulate these elements in situ . Although the pollution is overall high, it is heterogenous at the site scale and correlates with the expression of some metal transporter genes. Metabarcoding analyses revealed a decreasing gradient of microbial diversity, with more OTUs discovered in the rhizosphere than in the soil bulk, especially at the bottom of the cores. However, the variability gradient increases with the distance from roots. Using an ad hoc pseudo-taxonomy to bypass the biases caused by a high proportion of unclassi�ed and unknown OTUs, we identi�ed Chloro�exi, Armatimonadetes, Pirellulaceae, Gemmatimonadetes and Chitinophagaceae as the drivers of the differences in the gradient along the cores. In contrast, no signi�cant difference was identi�ed between the rhizosphere composition of A. halleri and N. caerulescens . This suggests that, despite their distinct colonization history in Prayon, the two plant species have now recruited highly convergent microbial communities in the rhizosphere.


Introduction
Angiosperms, i.e., owering plants, are broadly distributed in a large variety of environments.One of such niches is metal-polluted soils, i.e., soils with an elevated concentration of metals, often of geogenic (e.g., for serpentine, Ni-rich, outcrops) or anthropogenic origin.In the latter case, these sites have been polluted either by mining activities, or by metal-transforming factories or through atmospheric deposits resulting from these activities.Plants living on metalpolluted sites are hypertolerant (Antonovics et al., 1971;Ernst, 2006): they can survive at metal concentrations much higher than the majority of plants.
Among those hypertolerant plants, some are also hyperaccumulators: they have the unique ability to take up metals from the ground and accumulate them within their leaves at high concentration (e.g., >0.3% Zn or 0.01% Cd in shoot dry weight, SDW) (Hanikenne and Nouet, 2011;Krämer, 2010;Merlot et al., 2018).
Such extremophile organisms are interesting for phytoremediation but also phytomining or bioforti cation (Clemens, 2017;Clemens et al., 2002;Lopez et al., 2019;Salt et al., 1998), i.e., the use of plants to clean up metal-contaminated sites, to mine metals or to enrich food for populations with element de ciencies, respectively.Two extensively studied hyperaccumulators are Arabidopsis halleri (L.) O'Kane & Al-Shesbaz and Noccaea caerulescens (J.Presl & C.Presl) F.K.Mey., 1973.These two Brassicaceae are plant models for zinc and cadmium hyperaccumulation and are able to live on polluted and unpolluted soils.
More and more molecular actors underlying metal hyperaccumulation have been identi ed over the last 15 years (Hanikenne and Nouet, 2011;Krämer, 2010;Merlot et al., 2018).Even though studies have shown the importance of root-associated microorganisms in this process (Li et al., 2007;Ma et al., 2011;Sessitsch et al., 2013), little is known about the rhizosphere of metal hyperaccumulators.This could be explained by the complexity of this micro-ecosystem (Alford et al., 2010).Yet, microorganisms are known to play a role in the physiology of diverse plants (such as rice, Pseudostellaria heterophylla or Amaranthus albus (Fitzpatrick et al., 2018;Long and Yao, 2019;Wu et al., 2019;Yu et al., 2019)), including a role in processes related to metal tolerance (Ullah et al., 2015).
Albeit A. halleri does not rely on mycorrhizal symbiosis (Delaux et al., 2014;Wang and Qiu, 2006), it can associate with soil bacteria in zinc/cadmium contaminated areas, and the associated community depends on the level of metal contamination of the soil (Gomez-Balderas et al., 2014, p.).Moreover, rhizosphere bacteria may in uence metal uptake by A. halleri (Farinati et al., 2011;Muehe et al., 2015).However, the taxonomic composition (Borymski et al., 2018) of rhizosphere microbial communities remains poorly known.
Our knowledge of the rhizosphere microbiota of N. caerulescens is even more limited, as available studies are more focused on the physical and chemical soil components in the rhizosphere than on the microbial communities themselves (Luo et al., 2018;Rosenfeld et al., 2018;Yang et al., 2011).Nevertheless, one study conducted in an agricultural soil prepared at this effect suggested that rhizosphere bacteria may also in uence metal uptake by N. caerulescens (Whiting et al., 2001).
As some rhizosphere microorganisms are species-speci c (Borymski et al., 2018;Fitzpatrick et al., 2018), it raises the question of whether the convergent evolution of hyperaccumulation in A. halleri and N. caerulescens extends to the microbial populations colonizing their rhizosphere.Yet, this is challenging to test because their geographic distribution does not overlap: N. caerulescens lives in the northwestern and western parts of Europe while A. halleri is found in more central and eastern countries of the continent (Krämer, 2010).
However, the two hyperaccumulators in fact coexist in one site in Belgium.This location is a small area of a vast zinc-polluted site in Prayon (near Liège, Fig. S1a).N. caerulescens massively colonizes the site, whereas A. halleri has a more limited distribution (Fig. S1c).To our knowledge, there has been no soil or ore characterization done in Prayon since 2001 (Meerts and Grommesch, 2001) and very few studies had been carried out previously (Denaeyer-De Smet and Duvigneaud, 1974;Duvigneaud and Jortay, 1987;Ramaut, 1964).
Historically, metal-oriented human activities in Prayon date back to 1503.However, the industrial activity related to zinc only began in 1829 and led to a rapid transformation of the vegetation.The activity progressively increased between 1883 and 1913, a period during which trees disappeared and metallophytes appeared (Duvigneaud and Jortay, 1987).Atmospheric and soil pollution caused by the zinc-ore smelter were rst described in 1964 (Ramaut, 1964).In that study, the zinc concentration (Fig. S1b, red dot) was measured at 540 mg kg -1 of dry soil (DW), i.e., more than twice the concentration considered "normal" at the time.
Consequences on the vegetation were already visible: the destruction of the original wooded ora, i.e., found before the zinc-ore smelter activity, was obvious, and hypertolerant plants were beginning to colonize the area with, at some places, N. caerulescens being the only plant species found (Ramaut, 1964).In 1974, the hyperaccumulator was largely present over the site.The soil pH (Fig. S1b, red dot) was between 5.5 and 6.6 and zinc concentrations had increased to values between 10,000 and 12,000 mg kg -1 (DW).Zinc concentration in N. caerulescens leaves was 23,950 mg kg -1 (DW) (Denaeyer-De Smet and Duvigneaud, 1974).
When the factory shut down in 1970, some of the original herbaceous species (e.g., Mercurialis perennis) eventually managed to recolonize the site but not all of them.At that time, N. caerulescens was well implanted on the polluted soil around the zinc-ore smelter.Some individuals were collected in 1987 in order to determine zinc, cadmium and lead leaf concentrations, which were measured at 24,700, 1,465 and 800 mg kg -1 , respectively (Duvigneaud and Jortay, 1987).However, as pollution was quite widespread (hundreds of hectares, up to 2 km downwind, Fig. S1a) (Ramaut, 1964), we do not know precisely from which area of the Prayon site those individuals were sampled, even if the zinc concentration was similar to those obtained in 1974 (Denaeyer-De Smet and Duvigneaud, 1974) in the area of interest (Fig. S1b).
In 2001, the latest study in Prayon showed that there was a large variation of soil parameters across the whole site.Measurements performed on calcareous samples (Fig. S1a) varied from 5.02 to 6.18 for pH and from 2100 to 7800 mg of extractable zinc kg -1 (DW).Moreover, based on seed bank composition and abundance, the study suggested that large changes were not to be expected in the oristic composition in Prayon in the following years, and that indeed seems to have been the case until now, 20 year later (Meerts and Grommesch, 2001).
Concerning A. halleri on this site (Fig. S1c), the rst observation dates back to the early 2000's, according to naturalist accounts.Ironically, it might have been accidentally introduced by botanists scouting the site for N. caerulescens, but this hypothesis has not been supported by hard evidence so far.
Since A. halleri and N. caerulescens now coexist in Prayon, we sampled plants of both species at different plots with their bulk and rhizosphere soil and analyzed shoot and soil components as well as the taxonomic diversity present in soil samples.This revealed signi cant differences in microbiota between samples found at increasing distances of the roots, whereas the rhizospheres of the two plant species appeared to have a convergent microbial community.

Field sampling, sample preparation and characterization
Samples from the rhizosphere of both A. halleri and N. caerulescens were collected on the 6 of June 2018 at the same metallicolous site in Belgium (Prayon, near Liège), after more than two weeks of sunny weather (around 20°C).The eld sampling strategy (Fig. 1) consisted in choosing 6 random sampling plots within the zone where A. halleri was found, i.e., at the bottom of the slope, close to a shery pond (Fig. S1).Each plot had a surface of 1 m x 2 m, hence 2 m 2 .Within this perimeter, ve individuals of each species, together with their soil (core of 15 cm deep with a diameter of 8 cm) were collected and divided into four parts: shoot, rhizosphere, bulk 1 and bulk 2 (respectively, the upper and bottom parts of the core), which were each pooled per species and plot.Shoot samples for gene expression analysis were immediately crushed and stored in RNAlater® Invitrogen (ThermoFisher).GPS coordinates and soil temperature were recorded at harvest (resulting from the mean of ve measures carried out at each corner and center of each plot) (Table S1).
Rhizosphere soil samples were isolated by adapting the method of (Edwards et al., 2015).Brie y, roots were placed within a sterile Phosphate Buffer Saline (PBS) solution (NaCl 137 mM, KCl 2.7 mM, Na 2 HPO 4 10mM, KH 2 PO 4 1.76 mM, pH 7.4) then the solution was ltered through a 40-µm Nylon lter and centrifuged at 5000 x g during 10 minutes.Finally, the supernatant was removed before storage at -80°C.
Bulks were rst sieved using a mesh of 2 mm.For each bulk sample, a small part was stored at -80°C (in order to extract DNA) while the major part was weighed (fresh weight without stone = FW) before drying out for several weeks at 60°C.Dry soil was used to measure pH, water content, as well as exchangeable and extractable elements (Hendershot and Duquette, 1986;Stein et al., 2017).They were then weighed again (dry weight = DW).
The percentage of water content in bulks was computed as (FW-DW)/FW*100 while pH was determined following (Stein et al., 2017).Brie y, 3 g of dry soil was mixed with 7.5 mL of CaCl 2 (0.01 M) and shaked over the weekend.Samples were then centrifuged at 2000 x g during 2 min.pH was measured in the supernatant (with inoLab pH 7110).The procedure was repeated ve times for each sample and means were computed.The extraction of mineral elements from the soil was carried out with BaCl 2 0.1 M (exchangeable elements) and with HCl 0.1 M (extractable elements), as detailed in (Hendershot and Duquette, 1986;Stein et al., 2017).Brie y, 1 g of dry soil was shaken for 2 h and 30 min, respectively, then centrifuged and ltered through Whatman VWR 42 lter paper.
Finally, samples were submitted to element analysis by ICP-AES as described in (Nouet et al., 2015).
All soil samples were used for DNA extraction using the Qiagen DNeasy PowerSoil Pro Kit with the QIAcube robot and the IRT protocol.One extraction blank was generated for each plot.Small subunit ribosomal RNA genes (SSU rRNA 16S/18S) were ampli ed with a universal primer pair (515f modi ed / 926r (Walters et al., 2016): TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTGYCAGCMGCCGCGGTAA / GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCCGYCAATTYMTTTRAGTTT).Ampli cations with bacterial-speci c (S-D-Bact-0517-a-S-17 / S-D-Bact-1061a-A-17 (Maciejewska et al., 2018): TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGCCAGCAGCCGCGGTAA / GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCRRCACGAGCTGACGAC) and archaeal-speci c (Uni519F / Arc908r (Jørgensen et al., 2013): TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCAGCMGCCGCGGTAA / GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGCCCGCCAATTCCTTTAAGTT) primer pairs were also performed in an attempt to corroborate the results.At least one PCR blank was included with each primer pair.Universal and archaeal primers were designed to amplify V4-V5 regions while bacterial primers targeted V4-V6 regions.The program used for PCR ampli cation is provided in Table S3.
The amplicons were then sequenced (paired-end, 2 x 300 bp) using the Illumina MiSeq 600-cycle V3.Library preparation was done following the 16S metagenomic work ow from Illumina (IlluminaTechnicalSupport, n.d.).Raw reads were deposited in the SRA database with BioproJect number PRJNA717797.

Bioinformatic preprocessing
The global pipeline is described in Fig. S2 and our custom scripts are available in the gshare project (Bertrand et al., 2021)  Brie y, reads were merged, stripped from primer sequences, oriented, quality ltered then pooled together using the usearch (Edgar, 2010) (v11.0.667_i86linux64) software suite, following the pipeline described in the documentation (Fig. S2).Clustering into operational taxonomic units (OTUs) was carried out with the UPARSE algorithm of usearch (Edgar, 2013).The same software suite was used to create an OTU table and its normalization by random read sub-sampling (allowing us to analyze the same total number of reads per sample).
A ltration of OTUs was also done with the script prune-outliers.pl,now distributed with the software Forty-Two (D. Baurain; https://metacpan.org/dist/Bio-MUST-Apps-FortyTwo).This was an additional quality control to retain only OTUs that were similar to at least one of the other OTUs, i.e., it kept SSU rRNA sequences and discarded non-homologous sequences potentially produced by the usearch pipeline.It is based on a mutual detection by all-vs-all BLASTN searches across all samples, given that SSU rRNA sequences are conserved enough at the nucleotide level to match those of another organism, even if quite far from a taxonomic point of view.
Moreover, a phylogenetic tree of the ltered OTUs was computed with RAxML (Stamatakis, 2014) v8.1.17after multiple sequence alignment using mafft v7.453 (Katoh, 2002;Katoh and Standley, 2013).In order to emulate a taxonomy with nested levels for all OTUs, including unknown and poorly known organisms, patristic distances were extracted from the tree and fed to the R package MCL (Jäger, 2015) to partition OTUs into six sets of clusters of varying number and sizes.Those different sets were obtained by xing the in ation parameter at 2.5 and by giving the "phylo" threshold the values 0.004, 0.008, 0.010, 0.015, 0.020, 0.025.In the following, the resulting nested levels will be referred to as pseudo-taxonomy.
As shown in Fig. S2, a pilot analysis was rst carried out to determine the optimal sequencing depth.To this end, a second step of ltration allowed us to keep only OTUs with eight or more supporting reads, so as to lter out chimeras (Fig. S2).Rarefaction curves were then generated with the R package ggplot2 (Wickham, 2009).Barplots showing the results from the clustering analyses were also generated with the R package ggplot2 (Wickham, 2009).

Diversity and abundance analyses Taxonomic analysis
Alpha and beta diversity were computed with a custom Perl script: compute-stat-diversity.pl based on Bio::MUST::Core and Bio::Community modules (F.Angly; https://metacpan.org/dist/Bio-Community).Those diversity measures were then analyzed graphically using the R package ggplot2 (Wickham, 2009) (graphstat-alpha-div-tax.R and graph-stat-beta-div-tax.R) with ANOVA tests.Taxonomic diversity across samples was also studied through NMDS using the R package vegan (Oksanen et al., 2018) with a discretization in 10 categories following the OTU abundances (nmds_diversity_univ30k.R).The R package ggplot2 (Wickham, 2009) was used for barplots (barplot_all_tax.R) and paired t-tests were computed.Venn diagrams were computed using the R package eulerr (Larsson, 2019) Correlation with soil and plant metadata Variability within and between sample metadata was analyzed using the script g_shoot_soil_param.R based on vegan (Oksanen et al., 2018) and ggplot2 (Wickham, 2009) packages.ANOVA tests were computed.Correlations were computed with the vegan (Oksanen et al., 2018) package.

Results And Discussion
Characterization of the Prayon site The two hyperaccumulators A. halleri and N. caerulescens colonize the Prayon site (near Liège, Belgium, Fig. S1), enabling the comparison of their hyperaccumulation capacities in a natural location and the study of the speci cities of the microbial composition of their rhizospheres.Shoot, rhizosphere, bulk 1 and bulk 2 (top and bottom of the core) samples of each plant species were collected in six plots on this calamine site (Fig. 1) in order to determine if their convergent evolution was re ected in the rhizosphere microbiota.
Soil pH, water content, temperature and element composition, including zinc and cadmium, were measured in bulk 1 and bulk 2 soil samples (Fig. 2).pH and temperature were stable through the six plots: temperature ranged between 17.8 and 19.2°C and pH between 6.04 and 6.59.This pH is lower than in other studies of A. halleri (Corso et al., 2018;Stein et al., 2017) and some of N. caerulescens (Luo et al., 2018;Rees et al., 2019;Rosenfeld et al., 2018) even if it seems more common for N. caerulescens (Luo et al., 2018;Rosenfeld et al., 2018).Water content was slightly more variable, ranging between 29.52 and 40.47% of fresh weight for bulk 1 and between 27.41 and 58.14% for bulk 2, which is coherent with previous study (Luo et al., 2018).
Overall, a limited variation is observed for extractable (HCl) and exchangeable (BaCl 2 ) element concentrations between soil cores associated with A. halleri or N. caerulescens.Between bulk 1 and bulk 2 samples, only a few extractable elements varied (Al, Fe and Na), in contrast to exchangeable elements.Fe and Na are the most variable elements across plots (Fig. S4).As expected based on the history of the site (Denaeyer-De Smet and Duvigneaud, 1974;Duvigneaud and Jortay, 1987;Ramaut, 1964), a large zinc pollution is measured.Current values are largely higher than those measured in 1974 33 , even with respect to exchangeable rather than extractable zinc.This could be explained by the differences in the techniques used since the activity of the factory stopped in 1970.
Moreover, extractable Zn concentrations (i.e., 4,934 to 25,051 mg kg -1 for bulk 1 and 6,326 to 13,829 mg kg -1 for bulk 2) are largely superior to the reference values for Wallonia soil ("Législation/décret gestion et assainissement sols," n.d.).Indeed, natural soil is expected to contain a maximum Zn concentration of 196 mg kg -1 (DW) and the threshold for necessary intervention on industrial soil set by Wallonia regulations is 3,000 mg kg -1 .The same is true for cadmium, for which the intervention threshold is xed at 20 mg kg -1 .Here in Prayon, even the exchangeable concentrations (42.45 -174.77mg kg -1 ) are largely above this limitation.This is not the case for Pb and Cu, which have high values (extractable concentrations are respectively 500-1500 and 250-625 mg kg -1 ) but remain below intervention thresholds (respectively 1840 and 600 mg kg -1 ).
If we compare with concentrations in natural soils reported in other studies, Zn, Cu and Cd exchangeable concentrations are in the higher range of contaminated sites in which A. halleri can be found, and Pb exchangeable concentration is also within metallicolous site ranges (Stein et al., 2017).Moreover, Zn and Cd extractable concentrations are above most natural metallicolous sites colonized by A. halleri.Concerning N. caerulescens, the Prayon values of exchangeable Cd and Zn surpass total soil concentration found in other studies of natural soil (Rees et al., 2019) except one that is in the same range (Rosenfeld et al., 2018).This could explain the low plant diversity found in the sampling site.Along with A. halleri and N. caerulescens, only Armeria maritima, Viola lutea and Festuca ovina were identi ed, three metal hypertolerant but not hyperaccumulator plants.

Hyperaccumulation in plant shoot
Soil parameters in uence the accumulation levels in shoot, thus shoot samples (composed of ve pooled individuals) were collected within six 2-m 2 plots for each species (Fig. 1) and submitted to ionome pro ling and quanti cation of metal transporter gene expression.
Micro-and macronutrient shoot concentrations were mostly similar between the two plant species, with the exception of Mg, P and Mn concentrations, which were higher in A. halleri than in N. caerulescens (Fig. 3).Metal accumulation in shoot differed signi cantly between the two species only for Cd, where N. caerulescens hyperaccumulated more.In contrast, Zn and Pb concentration ranges did overlap between shoots of both plants.Finally, A. halleri and N.
caerulescens hyperaccumulated both Zn and Cd above the de ned hyperaccumulation thresholds of 3,000 and 100 mg kg -1 (DW), respectively (Krämer, 2010).Even if N. caerulescens is known to hyperaccumulate Cd, it is not the case for all ecotypes.It is particularly the case of the Prayon ecotype, which was described as unable to hyperaccumulate Cd (Rosenfeld et al., 2018).However, most studies used soils with lower Cd concentration, therefore limiting Cd accumulation.Furthermore, shoot Cd accumulation in the Prayon site is much lower than observed in the Ganges ecotype of N. caerulescens, a bona-de Cd hyperaccumulator ecotype (Halimaa et al., 2019).
We next examined the expression levels of metal transporter genes that were previously shown to be highly expressed in the shoot of A. halleri and N. caerulescens and linked to hyperaccumulation and/or tolerance (Hanikenne et al., 2008;Talke, 2006;van de Mortel et al., 2006): HMA3 (Zn and Cd vacuolar storage (Talke, 2006;Ueno et al., 2011)), HMA4 (Zn and Cd xylem (un)loading and distribution (Craciun et al., 2012;Hanikenne et al., 2008)), ZIP4/ZNT1 (Zn uptake and distribution (Lin et al., 2016;Talke, 2006)), ZIP6 (Cd tolerance (Spielmann et al., 2020;Wu et al., 2009)) and MTP1 (Zn vacuolar storage (Dräger et al., 2004;Gustin et al., 2009;Shahzad et al., 2010)).Note that information on the expression of those genes in plants in the eld is very limited.The HMA3, HMA4 and ZIP4/ZNT1 genes showed the largest variation in expression level (up to 3.4-fold) throughout samples, both in A. halleri and N. caerulescens.MTP1 seemed also to vary but to a lesser extent (Fig. S5, S6).Correlations of gene expression were computed with the element concentrations in shoot tissues.In A. halleri, a relation between the Pb concentration and ZIP4 expression is observed (p-value=0.0069), as well as between P and MTP1 (p-value=0.047).In N. caerulescens, a correlation is observed between the Mn concentration and the HMA4 expression (p-value=0.00046) and between Mg, Pb and K and MTP1 (p-value=0.02,0.035 and 0.062, respectively), as well as Pb and HMA4 (p-value=0.052) (Fig. S7).
Correlations were also observed between the soil concentration of some elements and the expression of related transporter genes.In A. halleri, it is the case between the concentration of Zn and the expression of ZIP4 (p-value=0.0099) as well as between Cd and HMA4, Cd and ZIP6, Na and ZIP4 (p-value=0.019,0.031 and 0.049, respectively).In N. caerulescens, correlations were found between ZNT1, HMA3, ZIP6 and HMA4 and several elements (Fig. S8).
Those correlations are relatively weak and at this point, disentangling causal relationships between gene expression and metal concentrations would be hazardous.Differences in the observations in A. halleri and N. caerulescens might be explained by their different physiology (Merlot et al., 2018), including distinct metal storage sites, in mesophyll and in epidermis for A. halleri and N. caerulescens, respectively.
Interestingly, in spite of a generally high level of zinc and cadmium pollution, the heterogeneity of the site of Prayon seems to correlate with the expression of some metal transporter genes.However, no correlation was found between the amount of zinc present in shoot and the amount present in soil, in contrast to Stein et al. 2017(Stein et al., 2017) (Fig. S9).

Microbial diversity
On the optimal sequencing depth Since microorganisms present in the rhizosphere impact plant physiology and can be species-speci c, a major goal of this study was to characterize as accurately as possible the microbial diversity of our soil samples, i.e., the number and relative abundance of each OTU or taxon.To this end, DNA was extracted from all soil samples (rhizosphere and bulks) and SSU rRNA (16S/18S) gene regions were ampli ed either with universal primers or with bacterial and archaeal-speci c primer pairs for corroboration.Amplicons were sequenced using the Illumina MiSeq technology.
We rst tested whether the planned sequencing depth enabled the identi cation of the whole microbial diversity in our samples.This pilot analysis was conducted on three samples (rhizosphere, bulk 1 and bulk 2) from ve pooled A. halleri individuals sampled in the same plot.Five different depths were tested, obtained by in silico sub-sampling of the available reads, resulting in a relative depth from 1x to 12x in each of the three samples.Reads from each depth were treated independently and processed through the pipeline described in Material and methods to identify the OTUs (i.e., potential organisms) present.
As observed in Fig. S10, the number of different taxa discovered for any sequencing depth and any taxonomic level reached a different plateau.Hence, at deeper depth, more organisms are found when replaying the whole pipeline, suggesting that these rarefaction curves are not informative about our ability to discover (or not) all the microbial richness in the samples.Because we observe this phenomenon whatever the taxonomic level analyzed, it is unlikely to be (entirely) due to "new" sequences resulting from undetected chimeras or sequencing errors.Moreover, similar results were obtained with all primer pairs.Yet, as many OTUs (up to 12.7% at the deepest depth) actually correspond to unclassi ed organisms according to SILVA taxonomy, it was di cult to determine whether increasing the sequencing depth really allowed sampling a signi cantly larger biodiversity.To address this issue, we computed a phylogenetic tree including all the discovered OTUs and labelled the leaves with the depth(s) at which each OTU was recovered (Suppl.File at [https://doi.org/10.6084/m9.gshare.14376785.v1]).This tree was then used to build a pseudo-taxonomy with ve nested levels of pseudo-taxa accounting for all OTUs, including those unclassi ed in the SILVA taxonomy.These analyses revealed that new pseudo-taxa consistently appear at all levels when increasing depth (Fig. S11).Hence, between 16.4 and 22.7% of the clusters segmenting the phylogenetic tree are only observed at the two deepest sequencing depths.
Those results suggest that there is no good way to ascertain the optimal sequencing depth to uncover the whole microbial richness, even if some studies try different approaches to resolve this problem (Alteio et al., 2020;Fuks et al., 2018;Xiao et al., 2018).Therefore, comparison between studies or samples should take into account the depth parameter in addition to the primer pair, the sequencing method and the bioinformatics pipeline used.In the present work, the sequencing depth was eventually set at 4x the depth of the pilot study, as it was the minimal depth providing a reasonable account of the real diversity.

Assembly and quanti cation of SSU rRNA OTUs
Altogether, a total of 36 samples (rhizosphere and bulks from both species in the six locations), along with extraction and ampli cation blanks, were sequenced by Illumina after ampli cation of a SSU rRNA fragment.The results presented in the following were obtained with the universal primer pair, which uncovered the largest richness (compared to bacterial and archaeal primer pairs).Indeed, data obtained with bacterial and archaeal primer pairs had to be normalized at 1,000 reads (instead of 30,000; see below).Moreover, they were not especially Bacteria or Archaea-speci c, contrary to our expectations, and did not allow us to nd additional organisms beyond those obtained with the universal primer pair.The corresponding data is nevertheless available at [https://doi.org/10.6084/m9.gshare.14258528.v1].
Reads were analyzed through the bioinformatics pipeline described in Material and methods in order to generate a normalized OTU table in which each OTU has been assigned a taxon based on SILVA taxonomy.The normalization was xed for all samples at 30,000 reads.It was the minimal number of reads for which we could have a reasonable taxonomic diversity with a maximum of our samples.The resulting OTU table counts 9,796 OTUs (out of the 11,328 originally identi ed by usearch) and 30 out of the 36 samples were conserved.Because of the 30,000-read normalization, we have lost the bulk 2 sample from A. halleri in the second plot and ve samples from the sixth plot (i.e., all except the bulk 1 from N. caerulescens).Finally, after the ltration step, 9,323 OTUs remained and were used in the following analyses.

Microbial diversity at the OTU level
The taxonomic diversity was then compared between these 30 samples.A non-metric multidimensional scaling (NMDS) was rst computed to visualize whether samples had similar (or different) microbial diversity comprising the number of different organisms and their abundance (Fig. 4a).A clear separation between each part of the core is observed, i.e., between rhizosphere, bulk 1 and bulk 2. It also shows a gradient of diversity from the roots to the bottom of the core, i.e., rhizosphere then bulk 1 (top of the core) then bulk 2 (bottom of the core).Interestingly, the projected distance between rhizosphere and bulks is larger than between the two bulks, suggesting a more different microbial diversity in the rhizosphere.Moreover, the variance within each type of sample apparently follows the same gradient.
Since the major variability, i.e., differences found in the diversity between samples, seems to be driven by the physical distance from the roots, which is in line with other studies (Praeg et al., 2019;Song et al., 2020), this analysis could hide potential differences in the microbial diversity between the rhizospheres of the two species (A.halleri and N. caerulescens).To check that possibility, another NMDS was computed for those speci c samples (Fig. 4b).While samples from either plant visually form distinct groups, their separation is not very clear, suggesting only subtle differences between rhizosphere compositions, less important than the differences between sample types (i.e., rhizosphere, bulk 1 and bulk 2).This is coherent with the distribution of OTUs in Venn diagrams, according to sample type or plant species (Fig. S12).
Statistical values for alpha-and beta-diversity were then computed at the OTU level within and between samples, respectively (Fig. 5) to better understand the variations revealed in the NMDS plots.A decreasing gradient of richness is observed from rhizosphere to bulk 2 (Fig. 5a), i.e., the number of different organisms is the highest in rhizospheres, which is fully coherent with the rst NMDS analysis (Fig. 4a).
However, Simpson's index of alpha-diversity, which takes into account both the number of "species" present and their abundances, highlights that the overall variability in bulk 1 is similar to the one found in the rhizosphere (Fig. S13a), even if there are more organisms that are different in the rhizosphere (Fig. 5a).
Simpson's index also shows a larger variability between bulk 2 samples (Fig. S13a), which is con rmed by Euclidean distances of beta-diversity, taking abundances into account too (Fig. 5b).Similarly, beta-diversity measures also show that bulk 2 differs more from bulk 1 and rhizosphere, which are very similar in terms of Euclidean distances computed on OTUs.
In summary, the taxonomic diversity is mostly driven by the physical distance from the roots, along a decreasing gradient, rhizospheres having the highest richness and Simpson's diversity but a smaller variability between samples than bulks, with bulk 2 having the highest variability and the smallest number of different organisms.Thus, the rhizosphere appears to be a more favorable microenvironnement than bulks but with more constraints.However, those constraints appear similar between the two plant species, considering the convergence of the microbial communities observed in their rhizosphere.

Microbial diversity at different taxonomic levels
The microbial composition (top-10 taxa) was then compared between samples at different taxonomic levels (Fig. 6a).Differences appear between rhizosphere and bulks, and maybe between plant species too.At the phylum level, even if the most represented phylum is Proteobacteria for all samples, it is less abundant within bulk 2.More Bacteroidetes and Actinobacteria are present in the rhizosphere.Moreover, Acidobacteria, Chloro exi and Firmicutes are less represented in the rhizosphere than in bulks, which may depend on metal contamination levels as suggested by observed results in the presence of severe Cdpollution in a previous study (Song et al., 2020).Interestingly, unclassi ed taxa (according to SILVA taxonomy) are among the 10 most represented taxa at each taxonomic level.
Based on the results obtained with OTUs (Fig. 5a), we expected to observe a larger richness in the rhizosphere.However, if we take into account the "other" taxa that are clustered together (i.e., the remaining taxa out of the 10 most abundant), it does not seem to be the case at the phylum level (Fig 6a).Indeed, bulk 2 has not only more "other" phyla but also a larger proportion of unclassi ed and unknown (= undef_tax) organisms.This difference might be biologically genuine and explained by visualization at a higher taxonomic level (i.e., phylum) than the OTUs used previously.As a case in point, the richness shows a loss of signal and a large variability that is not in line with observations on OTUs throughout the different taxonomic levels (Fig. S14).Regarding beta-diversity measures, they are even more variable across levels (Fig. S15).
Alternatively, the systematically large proportion of unclassi ed and unknown organisms might also impact the analyses, considering that information loss is greater at lower taxonomic levels, i.e., it is more di cult to reliably classify an OTU at the genus level than at the class or phylum level (Table S4, Fig. 6a).In detail, 6,186 out of 9,323 identi ed OTUs (66%) are classi ed at any level, and only 1,794 (19%) are a liated down to the genus level.The remaining OTUs (15%) are classi ed at various higher taxonomic levels, including 350 OTUs (4%) above the phylum level, thus remaining unknown for most practical purposes.Furthermore, taxonomic a liation levels are heterogeneous between sample types, with more reads classi ed at the genus level in the rhizosphere than in bulk 2 and the opposite at the phylum level (Fig. S16, Fig. 6a).

Microbial diversity with pseudo-taxonomic a liations
To bypass these biases created by missing taxonomic a liations, a pseudo-taxonomy was computed following the same strategy as for the sequencing depth pilot study (see above), but this time using all OTUs remaining after ltration (9,323, Fig. S2) to compute the phylogenetic tree (Table S4).As shown in Table S5, the number of different organisms found in each pseudo-taxonomic level is constant.
With this pseudo-taxonomy, richness measures were computed anew (Fig. S17).No particular signal seems to emerge at pseudo-class and -phylum levels.
However, the variability pattern observed at pseudo-species, -genus and -family levels is congruent with the pattern found with OTUs (Fig. 5a).Indeed, the richness is higher in the rhizosphere and follows a gradient of decreasing diversity through sample types.This was not observable above the OTU level when using the SILVA taxonomy (Fig. S14).
Beta-diversities were then recomputed on this pseudo-taxonomy (Fig. S18).They appear coherent both with Simpson's index (Fig. S19) and earlier analyses carried on OTUs (Fig. 5b).Indeed, there is a gradient throughout pairwise comparisons from rhizosphere to bulk 2 at the pseudo-order, -class and -phylum level, thereby con rming a smaller variability in the diversity found in the rhizosphere than in bulks, with bulk 2 being the most variable.Moreover, beta-diversity also shows a signi cant difference between the rhizospheres from A. halleri and N. caerulescens (Fig. S18) at multiple levels, as suggested by the second NMDS analysis (Fig. 4b) and the composition found with the pseudo-taxonomy (Fig. 6b).Indeed, differences (Fig. S18) between rhizosphere composition of the two plant species at pseudo-phylum, -class and -order are visually observed and some variability appears at pseudo-family and -species levels.
Euclidean distances between rhizospheres of A. halleri are almost systematically lower than those between rhizospheres of N. caerulescens (Fig. S18).This may suggest a sharper selection caused by the roots of A. halleri than by those of N. caerulescens.

Differences in the microbial diversity within rhizospheres
To investigate a possible difference of microbial composition in rhizosphere samples from both plant species, a paired t-test was computed across all six plots for all pseudo-taxonomic levels (Table S6a), using only the most abundant pseudo-taxa (i.e., those shown in Fig. 6b).However, not a single pseudo-taxon abundance appeared as signi cantly different at any taxonomic level, the best candidates being cluster 41 at the pseudo-family level and cluster 536 at the pseudo-species level, with a corrected p-value of 0.08 and 0.06, respectively.Cluster 41 is composed of OTUs that were taxonomically associated to Chloro exi (76%) and Armatimonadetes (4%) taxa (both from Terrabacteria group), as well as unclassi ed OTUs (20%), whereas cluster 536 is composed of OTUs assigned to Pirellulaceae (96%) and of a few unassigned OTUs (4%).Interestingly, those phyla are often cited when soil conditions vary (Edwards et al., 2015;Mukhtar et al., 2018;Sánchez-Marañón et al., 2017;Song et al., 2020;Sun et al., 2018;Wu et al., 2019), except Pirellulaceae, which might be an indicator of soil pH (Hermans et al., 2017).
In contrast, a comparison between bulk 1 and rhizosphere of A. halleri identi es 2-5 pseudo-taxa with a signi cant difference at each pseudo-taxonomic level (Table S6b), as does the same comparison for N. caerulescens samples (Table S6c).Among all those differences, Gemmatimonadetes, also varying in other studies (Sun et al., 2018;Yue et al., 2020;Zhao et al., 2020), Chloro exi and Armatimonadetes (along with unknown and unde ned organisms) are found in several clusters at various levels for A. halleri (cluster 435 at the pseudo-genus level, clusters 71 and 41 at pseudo-family, cluster 1 at pseudo-order and cluster 1 at pseudo-class).Cluster 536 at the pseudo-species level highlights again a difference in Pirellulaceae.Concerning the most signi cant differences found between bulk 1 and the rhizosphere of N. caerulescens, Chitinophagaceae compose the whole cluster 418 at the pseudo-species level.This phylum was previously identi ed as an indicator of aluminium concentration in soil (Hermans et al., 2017), but it was also shown to be mostly found in rhizosphere (Song et al., 2020).
These statistical analyses are globally in line with the results obtained at the OTU level (Fig. 4, Fig. 5) and corroborate the insights brought by our alpha-and beta-diversity analyses based on pseudo-taxonomy (Fig. S19, Fig. S18).While they demonstrate that relying on a pseudo-taxonomy can indeed be enlightening in a study like ours, they also con rm that the microbial communities of the two plant species studied in this work are essentially convergent.
Such an outcome might be explained either by the spatial proximity of the roots sampled in the eld and/or by the phylogenetic proximity of the corresponding plant species.Indeed, these two species belong to Brassicaceae and their rhizospheres might exert a similar selection on a possibly similar preexisting microbial community.A sampling of rhizospheres from non-hyperaccumulator plant species present on the same site could be useful to ascertain this hypothesis, such as Armeria maritima or Viola lutea.Another explanation could stand for the necessity of taking into account less abundant taxa.This would be in line with statistical differences obtained in beta-diversity (Fig. S18) and not with major taxa.Moreover, less abundant unknown and undetected taxa may be of importance in the speci city of microbiomes (Thomas and Segata, 2019).

Conclusions
The Prayon site near Liège (Belgium) is still highly contaminated with zinc and cadmium, and current values are even higher than those measured in 1974 33 .
Because of this pollution, only ve plant species are able to colonize the site.Among them are A. halleri and N. caerulescens, which both hyperaccumulate Zn and Cd.
Microbial diversity analyses revealed a larger number of OTUs in the rhizosphere of both A. halleri and N. caerulescens than in bulk soil samples (Fig. 7).The decreasing gradient of diversity, from rhizospheres to the bottom of the sampled cores, may be driven by a more favorable microenvironnement created by the roots than the highly polluted medium surrounding the roots (Hou et al., 2018;Praeg et al., 2019).Yet, this microenvironnement appears to impose selection constraints on the microbiota.This is evidenced in both plant species by a lower variability in the diversity found in the rhizosphere in comparison to the bulks, especially bulk 2 (which corresponds to the bottom of the core).
Taxonomic analyses at the phylum level were affected by a high proportion of unclassi ed and unknown OTUs, which were also observed at other taxonomic levels.Those OTUs lead to biases in the information provided by the taxonomic a liations.To bypass those biases, a pseudo-taxonomy was computed based on a global phylogenetic tree using all discovered OTUs.Consequently, alpha-and beta-diversity measures based on the pseudo-taxonomy showed a higher diversity in rhizospheres than in bulks, with bulk 2 being the most variable, as observed with the OTUs.Organisms at least partly implied in these differences belong to Chloro exi, Armatimonadetes, Pirellulaceae, Gemmatimonadetes and Chitinophagaceae, which were previously cited as varying with soil parameters.
In contrast, no signi cant difference between the rhizospheres of A. halleri and N. caerulescens could be found.This indicates that the rhizosphere microbiota of the two plant species is highly convergent.Nearly signi cant differences may be due to organisms belonging to Chloro exi, Armatimonadetes and Pirellulaceae, but if some compositional differences do exist, they are likely to involve unknown and unclassi ed organisms.
While the overall similarity of the rhizosphere of A. halleri and N. caerulescens could be explained by the proximity of the two plant species, whether in the eld or at the phylogenetic level, the composition of the soil could also be responsible.Indeed, the high level of zinc and cadmium contamination on the site likely to be the main factor in uencing the microbial diversity.Therefore, it would be interesting in further studies to compare the taxonomic composition of the rhizosphere in a single hyperaccumulator plant species but throughout different soils, polluted or not.

Declarations
Figures     Figure 6

Figure 1
Figure 1 Sampling strategy.(A) 6 sampling plots (rectangles) were chosen on the site of Prayon.(B) Each plot consisted of a surface of 1 m x 2 m, in which ve plants+cores were collected per species (Ah: A. halleri, Nc: N. caerulescens).(C) Each core was subdivided into four parts: the shoot, the rhizosphere, the upper part (bulk 1) and the bottom part (bulk 2).(D) Each part was pooled to constitute one sample per species and per plot.In total, 6 x 3 soil samples and 6 shoot samples were thus collected for each species.

Figure 2 Selection
Figure 2 Selection of soil parameters for A. halleri (Ah) and N. caerulescens (Nc) samples (see Fig. S4 for the full series of gure panels).Extractable (HCl) and exchangeable (BaCl2) element concentrations [mg kg-1 (DW)] in bulk samples are presented.Element concentration and water content values were measured from each soil sample.pH values are the means of ve measures per sample.Temperature (°C) values are the means of ve measures and bars are standard deviations.Numbers 1 to 6 represent the six sampling plots.ns: p > 0.05, *: p <= 0.05, **: p <= 0.01, ***: p <= 0.001, ****: p <= 0.0001, from ANOVA tests.Boxes represent the rst and third quartiles with the median in-between.Whiskers extends up to value 1.5 * interquartile range.Points are outliers.

Figure 3 Element
Figure 3Element concentrations in the shoot of plants collected in Prayon (Belgium).Values are from six sampling plots, each including a pool of ve plants, and are (a) above 6,000 mg kg-1 of shoot dry weight (DW), (b) between 1,000 and 6,000 mg kg-1 (DW) and (c) below 1,000 mg kg-1 (DW).Ah: A. halleri, Nc: N. caerulescens.Whiskers and other statistics details are as in Fig.2.

Figure 4 The
Figure 4The NMDS analysis, representing the taxonomic diversity, was performed on the relative abundances of OTUs identi ed in each of the 30 samples (a) or in each of the 10 rhizosphere samples (b).Sample labels (rhiz, bulk 1 and bulk 2; Ah: A. halleri, Nc: N. caerulescens) are shown at the centroids of their respective projection.1-6: sampling plots.

Figure 5
Figure 5 Distribution of alpha-and beta-diversity measures computed on OTUs.(a) Bias-corrected chao1 richness (alpha-diversity) are grouped by sample type (rhiz: rhizosphere, bulk 1 and bulk 2) and by plant species (Ah: A. halleri, Nc: N. caerulescens).(b) Euclidean distances (beta-diversity) are grouped by sample type comparison and by plant comparison.It takes into account the abundance of each OTU.Colors represent sampling plots (1-6).Grey (diff) represents comparison of samples from different sampling plots.Whiskers and other statistics details are as in Fig. 2.