16S Metagenomics investigation of Bacterial diversity and prediction of its functionalities in Moroccan phosphate mine ecosystem.

Native plants in extreme environments may harbor some unique microbial communities with particular functions to sustain their growth and tolerance to harsh conditions. The aim of this study was to investigate the bacterial communities proles in the Moroccan phosphate mine using 16S metagenomics sequencing. Taxonomic resolution of six hypervariable regions of the 16S rRNA was assessed in the collected samples and mock standard community. The bacterial functions enriched in the samples were also predicted to assess the contribution of the proposed low cost approach in the description of bacterial diversity and functionality in soil and plant. V4 and V67 regions performed better in the taxonomic resolution at different levels e.g., phylum, family and genus. However, V8 and V9 regions gave the lesser taxonomic resolution and V9 region identied only Proteobacteria phylum in the mock standard sample. Alpha and beta diversity analysis indicated low abundance of observed OTUs and diversity in the bulk samples compared with the rhizosphere samples of the native plants in the phosphate mine. Moreover, signicant (p = 0,021) differences in bacterial diversity were observed among at least three kind of samples : The bulk phosphate mine, the rhizosphere of the native wild plants of the phosphate mine and the rhizosphere of the wheat crop. The results indicated that both plant genotype


Abstract Background
Native plants in extreme environments may harbor some unique microbial communities with particular functions to sustain their growth and tolerance to harsh conditions. The aim of this study was to investigate the bacterial communities pro les in the Moroccan phosphate mine using 16S metagenomics sequencing. Taxonomic resolution of six hypervariable regions of the 16S rRNA was assessed in the collected samples and mock standard community. The bacterial functions enriched in the samples were also predicted to assess the contribution of the proposed low cost approach in the description of bacterial diversity and functionality in soil and plant.

Methods
The Ion 16S™ metagenomics kit was used to compare the discriminatory potential of the six targeted hypervariable regions of the 16S rRNA gene in the description of the bacterial communities. Alpha and beta diversity analysis were also performed accordingly in the studied samples. The later were collected from the bulk phosphate mine and the soils attached to roots of three different wild plants, that natively grow in this environment. Samples from the rhizosphere of wheat plants in cropping system were also studied. Actinobacteria related sequences were compared between the samples to investigate their possible origin between the bulk mine and the wild plants rhizosphere. The prediction of 82 bacterial functions related to nitrogen and phosphate metabolism along with stress was performed using Tax4fun bioinformatics tool.

Results
The rhizosphere of the three wild plants in the Moroccan phosphate mine is characterized by interesting bacterial diversity including Proteobacteria (62,24%, 71,15% and 65,61%), Actinobacteria (22,53%, 15,24%, 22,30%), Bacteroidetes (7,57% ; 4,23% ; 7,63%), and Firmicutes (5,82%; 1,17% ; 2,83%). Broad taxonomic diversity of minor phyla was also found in the native plants samples and included Acidobacteria, Armatimonadetes, Chloro exi, Cyanobacteria, Gemmatimonadetes and Verrucomicrobia. The bulk phosphate mine samples were dominated by Actinobacteria with average relative abundance of 97,73% and Proteobacteria was Less abundant phylum in these samples. V3, V4 and V67 regions performed better in the taxonomic resolution at different levels e.g., phylum, family and genus. However, V8 and V9 regions gave the lesser taxonomic resolution and V9 region identi ed only Proteobacteria phylum in the mock standard sample. Alpha and beta diversity analysis indicated low abundance of observed OTUs and diversity in the bulk samples compared with the rhizosphere samples of the native plants in the phosphate mine. Moreover, signi cant (p = 0,021) differences in bacterial diversity were observed among at least three kind of samples : The bulk phosphate mine, the rhizosphere of the native wild plants of the phosphate mine and the rhizosphere of the wheat crop. The results indicated that both plant genotype and mainly soil conditions may be involved in the shaping of bacterial diversity. Such indication was also con rmed by the prediction of functional pro les that showed enrichment of many functions related to biological nitrogen xation in the rhizosphere of native plants and the stress related functions in the bulk phosphate mine in comparison with the wheat rhizosphere samples

Conclusion
The proposed amplicon sequencing approach combined to bioinformatic prediction of bacterial functions could be considered as low-cost method to study the bacterial communities in soils and plants.
The application of such approach to samples from the phosphate mines and wild plants revealed the need to consider at least V3, V4 and V6-7 hypervariable regions of the 16S rRNA gene in the taxonomic resolution of bacterial diversity in soil samples. This approach revealed also interesting bacterial diversity governed by the plant genotypes and the local conditions with interesting predicted functionalities involved in plant growth and development and stress tolerance in extreme environments. Further studies could target some interesting identi ed phyla and predicted bacterial functionalities, such studies could also deeply investigate the origin of the bacterial diversity in these native plants to develop new generation of bacterial inoculants for sustainable agriculture.

Background
Plant associated microbiota have attracted many researchers worldwide and continue to be one of the most studied research areas in microbiology over the last decades. In a recent revision and review of the concept de nition and challenges in microbiome research, Berg et al [1] proposed an explicit de nition for the term microbiome based on the previous de nitions with some newly introduced amendments. The proposed de nition can be summarized as the theatre of activities of microorganisms living in a giving ecosystem. According to this de nition, plant microbiomes are not passive actors [2]. They can, rather, interact with host development, physiology and systematic defense (e.g., mycorrhizal symbiosis with wheat plants [3]), induce disease resistance (e.g. bacterial endophyte Enterobacter sp. "M6" that confers resistance to fungal pathogen Fusarium graminearum in Finger Millet [4]), confer tness advantages to plant host (e.g. Association of microbiome phenotypes with Arabidopsis genes involved in immunity, cellwall integrity, root and root hair development [5]) and increase plant tolerance to stress and drought [6-8].
In fact, plant Microbiome was identi ed as a key for the next green revolution [9], and numerous products and microbiome management strategies were developed in agriculture including (i) microbiome transplants, (ii) microbial inoculants, (iii) microbial and plant extracts, (iv) methods to change environmental conditions and (V) microbiome engineering [10][11][12] The analysis of the plant associated microorganisms, their composition and their activities and functions are still challenging. Traditionally, microbiological investigations of microbiota relies on the isolation and the cultivation of single isolates followed by morphological, biochemical and molecular characterizations. However, despite the continuous improvements in cultivation-dependent techniques, an often cited estimate indicated that only 1% of extant bacteria can be isolated in culture and as much as 99% or more of the microorganisms present in many natural environments are not readily culturable [13,14]. Although this traditional approach is still required in order to isolate single strains that can be applied as inoculum, the newly developed next-generation sequencing (NGS) molecular methods are, nowadays, widely used to pro le microbial communities and their dynamic in different plants and plant compartments [15]. Sequencing platforms (e.g., Illumina®, PacBio®, Oxford nanopore® and Ion Personal Genome Machine (PGM) ®) have known interesting advances in wet laboratory protocols, kits and bioinformatics pipelines in plant Microbiome related applications and studies. Such advances allow cost reduction, time preservation and high accuracy in deciphering the structure and function of microbial communities in plant microbiome analysis [16][17]. The microbiome analysis with NGS platforms relies mainly on the amplicon sequencing methods that target the 16S rRNA gene [18]. Its sequences serve as a marker of choice in order to infer the microbiome composition. 16S rRNA gene is universally present in bacteria and archaebacteria, it contains conserved, variable and hypervariable regions allowing taxonomical classi cation and separation. These characteristics of the 16S rRNA gene allow the development of numerous primer combinations that are used to amplify hypervariable regions and to generate amplicon of variable lengths suitable for subsequent sequencing with different NGS platforms (e.g., Illumina, Oxford Nanopore, Paci c Biosciences, Ion PGM). Illumina sequencing platforms are considered as the major player among NGS platforms and the MiSeq variant is considered as the default choice for metagenomics studies [18,19]. In the illumina systems, the variable regions V3 and V4 of the 16S rRNA are predominantly sequenced and analyzed in microbial diversity studies. Similarly, Thermo scienti c TM developed Ion 16S TM metagenomics kit that has the ability to amplify and sequence six hypervariable regions (V2, V3, V4, V6-7, V8 and V9) simultaneously [20]. However, this NGS system is rarely used in metagenomics analysis of 16S rRNA gene mainly in the context of plant -microbiome interactions. In this work we used the Ion 16S TM metagenomics approach to investigate the microbial diversity in the rhizosphere and bulk soil of native plants in Moroccan phosphate mine.
Plants are very diverse and occupy most of terrestrial ecosystems, where they play a key role as primary producers. They are also present in extreme and uncommon environments such as deserts, contaminated soils and ores. Many researchers postulated that native plants from these extreme environments may harbor some unique microbial communities with particular functions to sustain their growth and tolerance to harsh conditions [21][22]. Investigations of the microbiome of plants growing in unfavorable environments and stresses may provide insight into microbial and plant traits that allow them to alleviate such stresses (e.g., drought, salinity, pests and pathogens). In this sense, Pérez-Jaramillo et al. [23] proposed a 'back to the roots' framework that comprises the exploration of the microbiomes of native plants vegetation growing in their native habitats for the identi cation of microbial traits to reinstate bene cial association that may have been lost during plant domestication. For examples, in Chile, the rhizospheres of Atriplex sp. and Stipa sp. (shrubs) grown in the Atacama Desert were dominated by Gammaproteobacteria as revealed by 454-pyrosequencing studies [21]. Zhang et al. [22] described and compared abundance and composition of bacterial communities associated with leaves and roots of plants grown in Atacama desert ( Distichlis spicate (Poacea) and Pluchea absinthioides (Asteraceae)).
Authors revealed the abundance of Proteobacteria, Firmicutes, Actinobacteria and Bacteroidetes among the endophytic bacterial communities. Interestingly, most of operational taxonomic units (OTUs) were not shared between endospheres of these indigenous plants indicating the effect of plant genotype on the bacterial communities distribution. The ability of some native isolates from Atacama desert to act as plant growth promoting bacteria (PGPB) was con rmed in wheat plants, they also showed greater protection against salt stress [24].
Moroccan phosphate mines, mainly in the center of Morocco (Ben Guerir Location), are characterized by high temperature in the summer, aridity and rains scarcity, very low organic matter and high phosphate mineral contents. This environment can be considered as extreme and the native plants growing in this habitat could represent the optimal place to look for PGPB with interesting and unique traits that confer resistance to crops. Unfortunately, very few microbiological studies on phosphate mines have been carried out. In one of our recent studies, we isolated and characterized bacilli strains from rock phosphate mines that showed interesting plant growth promotion traits such as phosphate solubilization, phytohormone and ACC-deaminase production [25]. Previously, Hamdali et al. [26,27] studied Actinobacteria (e.g., Streptomyces griseus and Micromonospora antiaca) native in Moroccan phosphate mines that showed abilities to solubilize rock phosphate and other multiple plant growth properties under in vitro and greenhouse conditions. The above mentioned studies followed the traditional cultivationdependent techniques, and to the best of our knowledge there is no study applying the NGS methods to describe the microbial communities in this environment. In this work, we proposed the metagenomics analysis of the 16S rRNA gene in the rhizosphere of wild native plants in phosphate mines along with the bulk mine soil. We compared the composition and abundance of the analyzed bacterial communities with the wheat rhizosphere in cropping natural system. We used in this study the Ion 16S TM metagenomics kit (Thermo Fisher Scienti c Inc., Genome Biotechnologies, Morocco) in order to compare the discriminatory potential of the six targeted hypervariable regions of the 16S rRNA gene in the description of the bacterial communities in the studied samples. Barb et al. [28] analyzed the six hypervariable regions of the 16S marker gene by using the Ion 16S TM metagenomics kit to study the oral microbiome diversity. They developed a novel analytical pipeline with the V4 region enabled the best resolution of bacterial communities pro ling up to both family and genus levels. In a recent study, Tamošiūnė et al. [29] assessed the effect of cold Plasma treatment on the plant-associated Microbiome using the Ion 16S TM metagenomics kit which seems to be the rst study applying this kit on plantassociated microorganisms. However in this study, authors did not give indication related to the best 16S rRNA hypervariable region that gives a robust identi cation and classi cation of taxa.
In this work, we analyzed the effect of the six hypervariable regions in the taxonomic assignment, abundance and distribution of the bacterial communities in the studied samples. In this approach, the prediction of the bacterial functionalities based on 16S rRNA pro les was performed using Tax4fun software package available in R environment [30]. With this bioinformatics prediction, we gave insights into the functional capabilities of the microbial communities associated with natives plants in phosphate mines, in a cost-effective way in comparison with the shotgun metagenomics approach.

Results
Sequencing yield and pre-processing To investigate the microbiome associated with native plants that grow in phosphate mines in Morocco and its comparison with the phosphate bulk soil samples and the rhizosphere of wheat crop, a total of 15 samples were subjected to 16S rRNA metagenomics analysis using Ion Torrent PGM TM sequencing platform. Total obtained reads accounted for 2 782 897. they were ltered for chimeras, quality score and copy number. Reads were classi ed according to 16S rRNA variable regions for each sample and only reads with high number of copies were mapped against 16S rRNA databases (Supplementary materials   Table S1 and Table S2).
Taxonomic resolution of the primers targeting variable regions of the 16S rRNA Variable regions of the 16S rRNA were ampli ed in the DNA derived from the studied samples and the ZymoBIOMICS community standard using two separate primer sets with the Ion 16S TM metagenomics kit. Read Classi cation to taxonomic pro les at four levels e.g., phylum, family, genus and species was compared between the consensus and each region according to the number of taxa identi ed. In this section, results obtained for the standard ZymoBIOMICS DNA were also presented. Figure 1 showed that at the phylum level, V2, V3, V4, V67 and V8 performed well for the identi cation of the phyla present in the ZymoBIOMICS standard. However, V9 region identi ed only Proteobacteria phylum. Taxonomic resolution at more lower levels became challenging and V9 region gave less resolution up to family, genus and species levels. V67 and V8 regions gave better resolution than V9, nevertheless, new families and genera were found with these regions that are not present in the standard. V2, V3 and V4 regions gave the best resolution up to both family and genus levels ( Figure 1).
The same analysis was performed for the collected soil samples ( Figure 2). However, since the bacterial composition of these samples is unknown compared to the standard ZymoBIOMICS, the maximum number of each taxonomic level present in each sample was taken as the target resolution. For each region, resolution percentages were calculated accordingly and presented as heatmap. The overall analysis of the gure 2 indicated variability in the taxonomic resolution of different 16S rRNA regions according to the taxonomic level and the sample types. However, some regions gave the lesser taxonomic resolution e.g., V9 and V2 whereas V3, V4, V67 and V8 performed better in the taxonomic resolution mainly at the phylum level. At higher taxonomic levels e.g., Genus and Species, the number of taxa recovered by every hypervariable region decreased for almost all the regions except the V3 region.
Bacterial diversity analysis by consensus sequences in the studied samples: 16S Metagenomics analysis included samples from bulk phosphate mine, samples from the rhizosphere of three native wild plants in phosphate mine and samples from the wheat rhizosphere. Figure 3 presented the distribution and abundance of bacterial phyla, families and genera in the studied samples using the consensus sequences from the all variable regions of the 16s rRNA gene. Results of the bacterial composition clearly showed differences according to sample types and the plant species. Bacterial composition of the samples obtained from bulk phosphate mine in the vicinity of the wild plants was dominated by Actinobacteria with average relative abundance of 97,73% ( gure 3a). Proteobacteria was Less abundant in these samples compared to Actinobacteria. At the family and genus levels, an average of 94,54% and 94,29% of OTUs were assigned to Nocardioidaceae and Nocardioides, respectively in these samples. This result indicated the selective presence and dominance of Nocardioides members in this particular environment among the bacterial diversity. As showed in gure 3, the bacterial composition was completely different in the rhizosphere samples of the three native plants. Proteobacteria phylum was almost enriched than other phyla with relative abundance of 62,24%, 71,15% and 65,61%, in the rhizosphere Alternanthera caracasana (NP1); Digitaria sanguinalis (NP2) and Dittrichia Viscosa (NP3), respectively. Within this phylum, main representative classes were Betaproteobacteria (7,1%; 40,38%; 23,15% in NP1, NP2 and NP3, respectively), Alphaproteobacteria (26,84%; 19,65% ; 22,85% in NP1, NP2 and NP3, respectively) and Gammaproteobacteria (28,05%; 7,85%; 15,68% in NP1, NP2 and NP3, respectively). Deltaproteobacteria were less abundant. High relative abundances of Actinobacteria (22,53%, 15,24%, 22,30% in NP1, NP2 and NP3, respectively) were also found in these native plant samples, however, at lower level, Actinobacteria were mainly represented by Actinomycetales order (21,81% ; 12,58; 19,17% in NP1, NP2 and NP3, respectively) with wide taxonomic diversity at family and genus level. It is noteworthy that the relative abundance of Nocardioidaceae family was signi cantly reduced in these samples in comparison with the bulk phosphate mine samples ( Figure 3). Sequences assigned to Actinobacteria were further analyzed in this study. Other major phyla that were found in these samples were Bacteroidetes (7,57% ; 4,23% ; 7,63% in NP1, NP2 and NP3, respectively), and Firmicutes (5,82%; 1,17% ; 2,83% in NP1, NP2 and NP3, respectively). Broad taxonomic diversity of minor phyla was also found in the native plants samples and included Acidobacteria, Armatimonadetes, Chloro exi, Cyanobacteria, Gemmatimonadetes and Verrucomicrobia ( Figure 3). The wheat rhizosphere samples showed a relatively similar pattern as the bacterial composition of the rhizosphere of native plants. In these samples, the assignment of taxonomic a liation revealed high relative abundances of Proteobacteria (57,42%), Actinobacteria (35,24%), Firmicutes (5,52%) and Nitrospirae (2,32%). Broad taxonomic diversity of minor phyla was also found in the wheat rhizosphere samples. Actinobacteria were mainly represented by Actinomycetales and Gaiellales orders with 15,53% and 9,36%, respectively.
Alpha diversity measures (Observed OTUs, Chao, ACE and Shannon) and rarefaction curves are presented in Figure 4 and supplementary material Figure S1. Rarefaction analysis enabled us to gauge the extent to which total diversity had been recovered at each sample and indicated that the extent of bacterial diversity was captured by the sequencing approach. Figure 4 showed signi cantly lower diversity and OTUs number in the bulk mine samples. This nding was expected, because of the extreme conditions met in such habitat. Interestingly, the native plants in the phosphate mine showed higher observed OTUs, species richness and species diversity and evenness in comparison with the wheat crop rhizosphere samples, even if the observed differences were statistically not signi cant (Figure 4). Higher alpha diversity measures were recorded for the rhizosphere samples of the phosphate mine native plant Dittrichia Viscosa (NP3), this increase was statistically signi cant with other native plants NP1 and NP2 samples ( Figure 4). In contrast to other samples, the wheat rhizosphere samples were characterized by variable intra-diversity, which makes the box-plot range larger than in other samples. Some replicates were highly diverse and rich in terms of species, while others were characterized by their lower richness in bacterial communities (Figure 4).
Beta diversity measures included Bray Curtis distance unweighted and weighted UniFrac distances and jaccard dissimilatory. PCoA of the latter was presented in gure 5 and clearly indicated differences in bacterial diversity among at least three kind of samples : The bulk phosphate mine, the rhizosphere of the native wild plants of the phosphate mine and the rhizosphere of the wheat crop. Statistical analyses using PERMANOVA revealed signi cant differences between the three groups of samples (p=0,021). This nding indicated that the plant genotypes had limited in uence in the shaping the bacterial diversity in this particular environment.
The distribution of shared and unique OTUs among bacterial communities in the studied samples was presented in gure 6. High number of OTUs was shared between the samples at the phylum level such as Actinobacteria, Proteobacteria,Firmicutes and Bacteroidetes (Figure 6a). The shared OTUs between the rhizosphere samples (native plants and wheat crops) included Armatimonadetes, Planctomycetes and Verrucomicrobia, whereas only Nitrospirae phylum was shared between the bulk mine samples and the rhizosphere of native plants. Similarly, at lower taxonomic levels e.g., family ( Figure 6b) and genus (Figure 6c), high number of OTUs was shared between the samples (28 OTUs among 86 and 33 among 317 at the family and genus levels, respectively). Interestingly, high number of OTUs at the family level (27) were shared between the rhizosphere samples of native plants and wheat crops which included some plant-associated bacteria such as Rhizobiaceae and Phyllobacteriaceae, and indicated the effect of the plant rhizosphere in the interactions with bacterial diversity. The same statement is true for the shared OTUs at the genus level with high number among the plant samples ( gure 6c) Actinobacteria phylogenetic analysis : According to taxonomy and representative sequences tables of Qiime2, the number of sequences assigned to Actinobacteria varied between samples. Bulk phosphate mine samples accounted for 168 sequences assigned to Actinobacteria, whereas 336 sequences were assigned to Actinobacteria in the rhizosphere samples of the wild plants in phosphate mine. Figure 7 showed the phylogenetic relatedness and the inferred evolutionary relationships between the Actinobacteria sequences in the studied samples and revealed that Actinobacteria in Bulk RP were phylogenetically different and separated from those in native plant samples. Actinobacteria related sequences in the bulk phosphate mine samples yielded three different clades whereas, those in native plants samples were more diverse with many branches. Actinobacteria related Sequences in NP1 and NP3 were the most closely related among all samples with a common node of 72% (Figure 7). Actinobacteria in the Bulk phosphate samples were less evolutive than the sequences in the native plants samples which could be maybe related to adaptation and coevolution process between the bacteria and the plants All together these results indicated the selection of the Actinobacteria by plants in their rhizosphere and the absence of possible migration of some Actinobacteria from the bulk mine samples to the plants rhizosphere in their surrounding environment.
Prediction of Bacterial functions enriched in the samples : The Bioinformatic tool Tax4fun was used in this study to predict bacterial functions involved in nitrogen and phosphorus metabolism and stress across the studied samples (Figure 8). Figure 8a Figure 8C showed the enrichment pro les of bacterial functions related to stresses, it indicated that the Na+/H+ antiporters NhaC and NhaB, GABA permease, anion: H+ symporter DASS family, 1aminocyclopropane-1-acyteltransferase, lipopolysaccharide O-acyteltransferase, polysaccharides transporter PST family, lipopolysaccharides export system permease protein, lipopolysaccharides biosynthesis protein WzzE, acyl homoserine lactone synthase and acyl homoserine lactone e ux proteins, were highly enriched in phosphate rock samples compared to rhizospheric soil of native plants in phosphate mine and the rhizospheric soil of wheat crop. On the other hand, all of these bacterial functions are higher in the rhizospheric soil of native plants compared to the rhizospheric soil of wheat crops. On the other hand, the NhaC family and phosphate:Na+ symporter were enriched in the rhizospheric soil of native plants than in bulk phosphate samples.

Discussion
In this work we adopted a cost effective approach to decipher the bacterial communities distributions and functions in soil samples. We used the Ion 16S TM metagenomics approach to assess the six hypervariable regions of the 16S rRNA (V2, V3, V4, V6-7, V8 and V9) simultaneously, in the taxonomic assignment, abundance and distribution of the bacterial communities in the studied samples. The prediction of the bacterial functionalities based on 16S rRNA pro les was performed using Tax4fun software package to highlight any possible interesting bacterial functionalities that open perspectives for in depth future soil microbiome studies in phosphate mine.
With the advancement, the accessibility and the affordability of sequencing technologies, the use of high throughput sequencing platforms in soil microbial ecology has become popular and revolutionized our understanding of the soil -plant -microorganisms interactions. However, Nannipieri et al. [31] discussed and criticized the use of DNA-based tools in the determination of soil microbial diversity and functionalities. Schöler et al. [32] highlighted in their opinion paper biases and pitfalls in the steps of generation, processing and interpretation of 16S rRNA sequencing data in the eld of soil and plant microbiome studies. They clearly recommended the sequencing of the entire 16S rRNA gene using some new long read sequencing platforms in comparison to shorter amplicons which are mostly generated by Illumina sequencing.
To our knowledge, the present work is the rst study dealing with soil bacterial DNA that used simultaneously processed multiple hypervariable regions of the 16S rRNA gene. Barb et al. [28] used multiple hypervariable regions of the 16S rRNA gene to study the human microbiome and concluded that the use of multiple regions improve the results of identifying bacteria that compose complex communities such as those inhabiting human. The Ion 16S™ Metagenomics kit was also used to evaluate the performance of individual hypervariable regions of the 16S rRNA gene, for capturing human vaginal microbiota [33]. Authors of this study indicated that analyses using the V3 region generally indicated the highest bacterial diversity followed by the V6-V7 and V4 regions, while the V9 region gave the lowest bacterial resolution. With a similar way, we used in our study the Ion 16S™ Metagenomics Kit with both genomic DNA from ZymoBIOMICS mock standard sample and DNA extracted from soil samples originated from different habitat mainly the Moroccan phosphate mine and native plants in the ore. The obtained results in the mock samples presented in gure 1 were in accordance with the ndings of Barb et al. [28] and Sirichoat et al. [33], since the hypervariable regions V2, V3, V4 and V6-7 regions gave the best resolution up to both family and genus levels. V9 region gave less resolution and identi ed only Proteobacteria phylum. However, the percentage of taxa resolved by each hypervariable region of the 16S rRNA in the DNA of the studied samples is variable according to the taxonomic level and the sample origin. In this case, V3, V4 and V67 performed better in the taxonomic resolution at different levels. On the other hand, V9 and V8 gave the lesser taxonomic resolution (Figure 2). This nding could also corroborate the suggestion of Barb et al. [28] which stated that the Ion 16S™ Metagenomics Kit does not perform well on the end of the 16S rRNA gene. Fuks et al. [34] proposed a method called Short MUltiple Regions Framework (SMURF), to combine sequencing results from different PCR-ampli ed regions to provide one coherent pro ling in order to improve the resolution of bacterial pro ling. Recent study benchmarked a new 16S full-length-based synthetic long-read (sFL16S) technology that enables longread sequencing by using an existing Illumina short-read sequencer and the general 16S rRNA amplicon (V3-V4) sequencing methods [35]. These studies, clearly indicated and recommended the necessity to consider the entire hypervariable regions rather than partial analysis of the 16S rRNA gene in the characterization of the diversity and composition of the microbial communities inhabiting specimen.
In our study, we characterized the bacterial communities structure in the phosphate ore (Bulk_RP) and we compare this bacterial diversity with the rhizosphere samples of some spontaneous wild plants within the phosphate ore (Alternanthera caracasana (NP1); Digitaria sanguinalis (NP2) and Dittrichia Viscosa (NP3)) along with the rhizosphere of wheat in cropping system (WR). We used the consensus analysis of six hypervariable regions of the 16S rRNA gene thanks to the Ion 16S TM metagenomics protocol in the taxonomic units assignments. Interesting ndings are provided in this study, the Actinobacteria accounted for more than 97% in the ore bulk samples with signi cantly lower bacterial diversity expressed by the alpha diversity indices (Figure 4), and the enrichment of particular family and genus e.g., Nocardioides. Actinobacteria are generally ubiquitous and have the ability to survive under extreme conditions. They form mycelia and/or produce spores to reach water and nutrients and to survive in these environments. They can also in uence the weathering of mineral elements [36]. Many reports describing the bacterial diversity in extreme habitats e.g., Chilean Deserts, Arabian peninsula desert, alpine pants and contaminated mine tailings are in accordance with our study and found Actinobacteria among the dominant phyla in these habitats [21,[37][38][39]. Members of Nocardioides genus have also been isolated using cultivation-based approach from some extreme niches in Morocco and the obtained isolates showed plant growth-promoting activities e.g., phosphorus solubilization, potassium solubilization, siderophores and indole-3-acetic acid production [40]. However, Na s et al. [40] speci ed that Nocardioides genus was rare among the other Actinobacteria genera present in these extreme niches which needs to be evaluated in light of our results showing that this genus was highly abundant.
A shift in the bacterial communities composition was observed between the bulk rock phosphate mine samples and the rhizosphere of native plants grown in this niche (Figure 3). This nding con rmed the roles of root exudates in recruiting and shaping the plant rhizosphere microbiome which is intensively reported [41,42]. This shift was translated by the enrichment of Betaproteobacteria, Alphaproteobacteria, Gammaproteobacteria, Bacteroidetes and Firmicutes with the appearance of new phyla mainly Verrucomicrobia (Supplementary materials Figure S2). Actinobacteria were also found with high relative abundance in the native plants rhizosphere however the Nocardioides genus was lower than newly enriched taxa within Actinomycetales and Gaiellales orders ( Figure 3). Alpha diversity measures presented in gure 4 indicated low abundance of observed OTUs and diversity in the bulk samples compared to the rhizosphere samples of the natives plants in the phosphate mine. Generally, studies indicated that the bulk soil under "normal conditions" is more abundant and diverse in OTUs than the rhizosphere part and plants recruit best adapted taxa [43][44][45]. However, Zhang et al. [46] reported similar results in the study of rhizosphere microbial communities diversity of seagrasses Thalassia hemprichii and Enhalus acoroides in the Xincun Bay in China. One explanation provided by the authors is that bulk sediments are characterized by lower organic matter contents compared to seagrass beds. Lower organic matter and lower humidity are also the case of phosphate mine ore in the studied area compared to the rhizoplane where plants create favorable microenvironment for higher bacterial diversity. Many factors are involved in the plant recruitment of microbial communities such as soil type, edaphic conditions and the host plant genotype [47]. Results of beta diversity presented in gure 5 indicated signi cantly separated bacterial diversity between the native plants, the bulk mine and the wheat crop with insigni cant differences between the three studied native plants. This result may suggest the in uence of the soil conditions rather than the genotype of the host plant in the shaping of bacterial communities. Such suggestion is in accordance with the conclusion of Eida et al. [39] who showed that the host desert plants phylogeny aligns to a lesser degree with the rhizosphere bacterial communities which are largely determined by the soil conditions. In this particular habitat, selective pressures are exerted to plants with adapted genotype and their associated bacterial communities to choose the best adapted phyla to help them thrive in such conditions [45]. Furthermore, it was intrigant to investigate the occurrence of some OTUs in the rhizosphere of native plants through the phylogenetic analysis of Actinobacteria related sequences. We found interesting separation of clades according to the sample origins (bulk phosphate vs native plants rhizosphere) ( gure 7) which opens hypothesis to explain the origin of Actinobacteria diversity in the studied samples mainly in the rhizosphere of the ore native plants. We reported high relative abundance of Nocardioides genus in the bulk samples exceeding 90% and which drastically decreased in the rhizosphere samples of the native plants, where we found more diversity within the Actinomycetales order. This diversity could be attributed to a plants-driven factors that determine the dominant bacterial populations in their rhizosphere [48]. Another explanation could be the occurrence of distinct ecotypes in the wild plants rhizosphere that have co-evolved. In fact, Lopes et al. [49,50] showed in two different speci c studies distinct populations of Pseudomonas koreensis and Pseudomonas putida between sugarcane rhizosphere or bulk soil and concluded that the divergence of these populations might have resulted from the differential selective pressures posed by the bulk soil versus rhizosphere habitats even in the populations of same species. Furthermore, the effect of microbial seed load on rhizosphere and endosphere bacterial communities was investigated in wheat by Kavamura et al. [51]. In fact, plant seeds harbor diverse microbial communities and their composition is determined by plant genotype, environment, and management practices [52]. The seed borne microbiome transmission is still poorly investigated and links between seed and soil microbiomes are not fully understood. Such transmission could also explain the diversity of bacterial communities in the rhizosphere of the native plants studied in the phosphate ore. In any case, further speci c studies are needed to investigate the occurrence of bacterial diversity in different part of plants and the surrounding environment mainly in extreme contexts.
In this study, we predicted the functional pro les of bacterial taxa based on 16S rRNA sequencing data using the Tax4Fun bioinformatic tool (Figure 8). This approach could be adapted in low cost studies based on the 16S metagenomics sequencing rather than the expensive shotgun metagenomics. It could also give insights into some particular bacterial functions under different context and conditions that could be targeted in subsequent studies. The obtained results, presented in gure 8, showed some interesting ndings. For example, the enrichment of the nitrogenase functions and the nitrogenase molybdenum-iron (MoFe protein) in the rhizosphere of native plants in the nitrogen-deprived phosphate mines, explained the recruitment by these plants of atmospheric nitrogen-xing bacteria to ensure its nitrogen requirements. Such functions were not found in rock phosphate bulk samples without plants, which strengthened the recruitment hypothesis. In wheat plants rhizosphere, these functions were less enriched since this environment is rich in nitrogen coming from fertilizers. The nitrogen xation process is based on the transfer of electrons from the Fe protein to FeMo-co, in which a nitrogen molecule is converted to two ammonia molecules. The biosynthetic process of all proteins involved in this mechanism such as FeMo-co are extremely vulnerable to environmental and endogenous oxygen. The aerobic and micro-aerobic nitrogen-xing organisms need to create strict anaerobic microenvironments in the cell to facilitate active nitrogenase functions. To meet this need, special accessory enzymes / proteins are required for the biosynthesis of clusters and mainly FeMo-co such as the genes products of nifH, nifQ, nifB, nifE, and nifN and nitrogen xation proteins NifX involved in e cient transfer processes of NifBco [53,54]. In this sense, NifZ is involved in the maturation of the P-cluster, the second cluster of catalyzing the biological nitrogen xation reaction by nitrogenase. On the other hand, the expression of the nitrogen regulatory proteins PII 1 and PII 2 are linked to the regulation of carbon metabolism under conditions of low nitrogen availability. In bacteria, PII proteins control target proteins in response to cellular ATP/ADP levels and the 2-oxoglutarate status, thereby coordinating the cellular carbon/nitrogen balance. In addition to their involvement in the regulation of carbon metabolism, these proteins are able to control the activity of acetyl-CoA carboxylase and glutamine synthetase (GS) in response to the availability of nitrogen sources [55]. The resulted prediction functions explain the hypothesis of recruitment of nitrogen-xing bacteria by native plants under nitrogen starvation, while ensuring the control of carbon metabolism and energy for proper functioning of nitrogenase under these stress conditions. Similarly, prediction of bacterial functions related to the solubilization and assimilation of phosphate revealed the enrichment of speci c functions according to samples nature. In rock phosphate bulk samples, several functionalities were enriched such as low-a nity inorganic phosphate transporters, acid phosphatase, 3-phytase, glucose-1-phosphatase, alkaline phosphatase, phospholipase, phosphonoacetaldehyde hydrolase, as well as phosphate regulon PhoB (Figure 8b). Enrichment of these functions could be explained by the low availability of inorganic phosphate in the rock phosphate, and the bacterial populations showed these adaptation mechanisms under this harsh environment such as the activation of low-a nity inorganic phosphate transporters. At the same time, the activation of the solubilization potential of phosphate via the expression of the gene products of phosphatases such as acid phosphatase, 3-phytase, glucose-1-phosphatase, alkaline phosphatase, phospholipase, phosphonoacetaldehyde hydrolase. However, due to the dephosphorylating action, acid phosphatase activity could indirectly in uence inorganic P solubilization by lowering the medium pH [56]. In the same way, phosphate regulon PhoB which is induced by phosphate deprivation was also enriched in the phosphate rock samples. The Phosphate (Pho) regulon is a bacterial regulatory mechanism used for the conservation and management of inorganic phosphate within the cell [57]. In the other hand, Bacterial high a nity transport systems are involved in active transport of solutes across the cytoplasmic membrane. Most of the bacterial ATP-binding cassette importers are composed of one or two transmembrane permease proteins, one or two nucleotide-binding proteins and a highly speci c periplasmic solute-binding protein [58]. This explained the prediction results of the enrichment of functions involved in activation of active transport in the case of wheat crops, such as the phosphate transporters, as well as, the enrichment of The inorganic phosphate transporters (PiT) family which appear to catalyze inorganic phosphate (Pi).
Results of the stress related bacterial functions enrichment showed that they were higher in the bulk phosphate mine samples and in the rhizosphere of native plants compared to wheat crop. Looking at the role of these stress related functions presented in gure 8C, we nd that the expression of these functional genes is linked to stress tolerance. Na + and H + are the most common ions and they play primary roles in cell physiology: both are important in cell bioenergetics. Indeed, when the concentration of these ions becomes too high or too low they turn into potent stressors to cells. every cell has a very e cient homeostatic mechanism for these ions. Proteins that play a primary role in this homeostatic mechanism are the Na + /H + antiporters. NhaC, the key Na + / H + antiporter, play a principal role for regulation of alkaline phosphatase biosynthesis, which has the role of solubilization of phosphate in alkaline conditions of rock phosphate. In the case of the rhizospheric soil of native plants existing in the phosphate mine, the low enrichment of this function could be explained by the change in soil conditions thanks to the root exudates rich in organic acid, amino acids and others metabolites and organic carbon.
This explains the increase in microbial diversity in this rhizospheric soil of these native plants compared to that of phosphate mine presented in alpha diversity measures. Low-molecular-weight organic compounds in root exudates play a key role in plant-microorganism interactions by in uencing the structure and function of soil microbial communities. Under these conditions of nitrogen and carbon stress in the phosphate rock, the GABA permease function was highly enriched. This enzyme is responsible for the release of GABA which plays a very important role in the regulation of nitrogen uptake, and at the same time considered as carbon source under stress conditions [59]. On the other hand, bacterial surface molecules known as lipopolysaccharide (LPS), produced by most Gram-negative bacteria, create a permeability barrier at the cell surface and is a main contributor to the innate resistance To our knowledge, the present work is the rst study dealing with soil bacterial DNA that used simultaneously processed multiple hypervariable regions of the 16S rRNA gene. The proposed amplicon sequencing approach combined to bioinformatic prediction of bacterial functions could be considered as low cost method to study the bacterial communities in soils and plants. The application of such approach to samples from the phosphate mines soil and wild plants in this environment revealed the need to consider multiple hypervariable regions of the 16S rRNA gene. In the taxonomic resolution of bacterial diversity in soil and plant samples we concluded the need to consider the V3 -V4 and V6-7 regions rather than the widely used V3 -V4 primers and the V9 region identi ed only Proteobacteria phylum. Basecalling and Bioinformatics analysis of sequencing data Basecalling and demultiplexing were performed with the Torrent Suite™ Software version 5.0.5 (Life Technologies) using recommended parameter settings. DNA sequencing data were collected in FASTQ format, analyzed using QIIME 2 v2020.8 [62]. Brie y, Raw torrent ion sequence reads were imported from QIIME2 using the "Sample Data" (Sequences with Quality). Then denoised and ltered with the dada2 pipeline to eliminate noisy and chimeric sequences, demultiplexed, and construction of noised paired-end sequences. Taxonomic identi cation was performed using Curated MicroSEQ(R) 16S Reference Library v2013.1, Curated Greengenes v13.5Greengenes v.13.5 and Silva databases, threshold value for percentage identity for genus and species ID was 97%.
All the basic measures used in the analysis of alpha and beta diversity were calculated on the basis of the rooted phylogenetic tree. In addition to observed OTU richness via alpha diversity, we used the phylogenetic diversity of Faith'DP (a phylogenetic measure of diversity based on total branch length of the bacterial 16S rRNA gene phylogeny captured by each sample) [63].The sampling depth of 100 readings, a reason to normalize the variance, which excluded four samples. Among these, the Kruskal-Wallis test (in pairs, p-value 0,05) was used to assess the statistical signi cance of alpha diversity. For Beta diversity analysis, following QIIME2 diversity plugin, OTUs were identi ed with qualitative (Jaccard and unweighted UniFrac) and quantitative (Bray-Curtis and weighted UniFrac) distance measurements at a sample depth of 100 readings. The statistical signi cance between the different groups was assessed by the permutation-based ANOVA (PerMANOVA) test [64] with 999 permutations (beta group signi cance command in the diversity plugin). Thereafter, Principal coordinate analysis (PCoA) plots were generated by the QIIME2 Emperor tool to explore the structure of the bacterial community. The bar graphs showing the taxonomy levels were generated by the Microsoft excel tool using the OTUs tables generated by Ion Reporter™ Software version 5.18. The package phyloseq was used to perform post-analysis of Alpha and Beta diversity. The Figures were generated by ggplot in R.
To separate reads into different variable regions from the 16S Metagenomics Kit we followed two strategies. The rst using Mothur as it is descripted in Barb et al. [28] and the second by Qiime2 using speci c primers for each region. For the Mothur method, brie y, we aligned the full set of reads with the Mothur script align.seqs using Silva as the reference database. The aligned report le was then submitted to an R script that counts the number of reads with the same start and stop alignment position along the reference database. The forward and reverse reads were then visualized along the translated 16S rRNA gene coordinates and reads were grouped into their corresponding variable regions based on where they aligned.
However for Qiime2 method, The regions V2; V3; V4, V6-7 and V8 were analyzed by the same Qiime2 pipeline described above. in addition, analysis as well as for comparison between the regions were carried out in the R environment (http://www.r-project.org/). Also, the Venn diagram was generated by R package according to different taxonomic levels and the graphical output was in the form of a Venn / Euler diagram. Phylogenetic analysis of Actinobacteria related sequences : To analyze the Actinobacteria sequences obtained in the Meta-taxonomics output, the Actinobacteria assigned sequences were exported from Qiime2 via the "Qiime taxa lter-seqs" command and then transformed to a fasta format to perform the phylogenetic analyses. Subsequently, the sequences of plant sequences were coded by plant name or Bulk via the command "sed 's/>NP1_ */' foo.in >bar.out".
The sequences were then combined in a single queue and aligned by MAFFT. E. coli "NR_024570.1" sequence was set as a reference for the alignment. The output of MAFFT is imported into MEGAX [65] to construct the phylogenetic tree by the Maximum likehoold method with a boostrap of 1000 times. We collected also the matrix of phylogenetic distance and the diversity parameters calculated by MEGAX.
The phylogenetic signal with K and λ were calculated by R via package phytools.
Predicting functional pro les from 16S rRNA data using Tax4Fun The prediction of the functional pro les of bacterial taxa based on 16S rRNA sequencing data was performed with Tax4Fun which an open-source R package using the OTUs    Principal coordinate analysis (PCoA) of bacterial communities from Bulk phosphate mine samples (from Bulk_RP1 to Bulk_RP3), wild phosphate mine plants Alternanthera caracasana (from NP1_1 to NP1_3), Digitaria sanguinalis (from NP2_1 to NP2_3), Dittrichia Viscosa fromNP3_1 to NP3_3) and wheat crop (fromWR_1 to WR_3). Analyses were done by using Jaccard distance matrix and visualized with emperor   phylogenetic tree of Actinobacteria sequences in three native plants samples represented by light colors (pink, sky blue and gray) and bulk phosphate mines samples represented by dark colors (purple, blue and green). The tree was generated by multiple pairwise alignment of sequences assigned to Actinobacteria derived from Qiime2 output using MEGAX and MAFFT Software. The phylogenetic tree was constructed by the Maximum likelihood program and a bootstrap analysis was performed with 1000 trials. E. coli "NR_024570.1" sequence was used as a reference.