Sampling and biological materials from the farming trials
The farming trials were conducted at Rangiroa, an atoll from the Tuamotu archipelago. The reef site named “Avatoru” (Fig. 1) was chosen, in consultation with the maritime authorities, to perform the aquaculture trials since D. metachromia was found naturally growing in that area. The sponge exploitation remained under the regulations of a permanent operating license from the Direction des Ressources Marines (DRM, Tahiti) during the whole study.
The aquaculture frames were directly anchored to the reef as supports of sponge explants coming from nearby parental specimens. These structures consisted of three horizontal, table-shaped iron frames submerged between 16 and 18m depth where the abundance of wild D. metachromia populations was found the highest (Figure S1). Four different sponge explants (roughly 4*4*4cm) were excised in December 2019 from the same original sponge. An additional piece was directly sampled as an uncultivated control, named “Donor”. The four explants were attached to the same frame using a nylon rope directly threaded through their spongin skeleton (Figure S2). Their sampling was then carried out using SCUBA at four distinct times, spaced by 3 months. During every field survey, survival and growth rates of the new sponge generations were calculated and one explant originally coming from the same donor sponge was removed to be subsampled in triplicates. To do so, pieces of D. metachromia specimens were manually excised with a knife to prevent squeezing the animals and transferred into a plastic bag filled with the surrounding seawater. Only small fragments of about 2*2cm were removed from each selected individual, leaving sufficient pinacoderm (i.e. outer layer of sponge cells) intact to allow regeneration after damage.
Samples were named according to their farming duration as follows: “9 months” for those collected in September 2020, “12 months” for December 2020, “15 months” for March 2021 and “18 months” for May 2021 (Table S1). Subsamples were cut using sterilized tweezers and scalpels for the barcoding analysis of the sponge and the metabarcoding study of the prokaryotic community. They were subsequently preserved in sterilized plastic vials filled with 96% ethanol and conserved at -20°C until DNA extraction.
Sampling and biological materials for the biogeographical complementary study
Different geographical scales were considered for the biogeographical study. A first approach was to conduct a global comparison, with samples collected from four different regions: the Red Sea (Saudi Arabia), the Indian Ocean (Maldives), the Makassar Strait (Indonesia), and the South Pacific Ocean (French Polynesia) at different sampling times from 2014 to 2019 (Fig. 1A, Table S2). Then, a regional scale was considered by targeting three different French Polynesian archipelagos.
The sampling of French Polynesian sponges was organized during an oceanographic campaign led by the French “Institut de Recherche pour le Développement” (IRD) from 19th October to 23th November 2018 onboard the Alis oceanographic vessel. Subsamples were taken from donor sponge populations located at 20m depth on the barrier reef of six different islands, encompassing more than 4000km2 and covering three neighboring archipelagos: Rangiroa, Makemo, Raroia, Tematangi (Tuamotu), Mangareva (Gambier) and Tetiaroa (Society) (Fig. 1B, Table S2 and S3).
To analyze differences at a higher geographical resolution, considering differences in terms of prevalent currents and oceanic exposition, four specific sites were chosen within the atoll of Rangiroa (Fig. 1C). Two locations named “Tiputa” and “Avatoru” were located on the external reef on the Western sides of both Tiputa and Avatoru passes. The two remaining areas named “Lagon 1” and “Lagon 2” were located within the lagoon, both facing the Avatoru pass in order to benefit from the semi-oceanic conditions allowed by the daily water exchange.
The sampling and subsampling were then processed as previously described for the study of farming trials.
Environmental parameters
Different devices were used to record temperature and salinity conditions throughout the farming experiment in Rangiroa. First, a temperature data logger (iBee 22L Thermo Button, Plug and Track, Proges Plus, France) was directly attached to one of the frames supporting the explants, to record seasonal temperature variations every hour with an accuracy of 0,1°C. The ThermoTrack PC V8 software was used for device calibration and data processing. Then, for salinity measurements in Practical Salinity Unit (PSU), the AquaSonde-5000 submersible multi-parameter probe (Aquaread, SDEC, France) was deployed using SCUBA at the beginning of each field survey. It was previously calibrated in the laboratory according to the manufacturer's recommendations and using the SondeLink-PC software. Lastly, for the remaining French Polynesian islands investigated in this study except Makemo due to a lack of working probes, temperature and salinity were automatically registered during each stopover using factory sensors directly operated onboard the Alis vessel.
For nutrient analyses, a subsample of 20mL of seawater was fixed with 20µL of HgCl2 in polyethylene scintillation vials (Dominique Dutscher SAS). The vials were vigorously shaken, and then kept at 4°C and away from light exposure. Nutrient content analyses of dissolved phosphate (PO43−), nitrogen oxides (NOx), and orthosilicic acid (Si[OH]4) were performed by the Laboratoire des Moyens Analytiques (LAMA, Nouméa, New Caledonia) using a SEAL AA3 HR AutoAnalyzer.
For chlorophyll a measurement, a subsample of 500 mL of seawater was filtered on a Whatman® GF/F glass-fiber filter (0,7 nominal pore size; 47 mm diameter) to prevent the loss of picoplankton (Aminot and Rey 2001). Filters were then placed in aluminum foils and stored at -20°C. Extraction of chlorophyll a from phytoplanktonic cells was performed using 90% acetone after mechanical grinding of the filters into glass tubes. The 6-mL extracts were then stored overnight in darkness at 4°C and centrifuged the next day a few hours prior to analysis. Concentrations were measured by a Trilogy® Laboratory Fluorometer (Turner Designs) using the chlorophyll a Non-Acidification module. Calibration of the fluorometer was made using a Prim’Light SECOMAM® spectrophotometer.
Barcoding of sponge samples
For phylogenetic analyses, DNA was extracted from all samples (except those from Lagon 1 and Lagon 2) using the Qiagen DNeasy Blood & Tissue kit and following manufacturer protocols for spin column extractions. The sponge 28S rRNA gene was amplified using the primers 28S-C2-fwd (5′-GAAAAGAACTTTGRARAGAGAGT-3′) and 28S-D2-rev (5′-TCCGTGTTTCAAGACGGG-3′) (Chombard et al. 1998) in a 25 µl reaction containing: 7.9µl distilled water (sterile milliQ); 5µl 5x Thermo Fisher Phire Green; 1.0µl 25mM MgCl2; 1.0µl 10mg.ml− 1 Life BSA; 5.0µl 5x Qiagen Q-solution; 1.3µl 10 pMol.µl− 1 forward primer; 1.3µl 10 pMol.µl− 1 reverse primer; 0.5µl 2.5mM dNTP; 0.5µl Thermo Fisher Phire II HS Taq and 1.5µl 0.05-0.5ng.µl− 1 DNA template, with the following amplification parameters: initial denature of 98℃ for 30 sec, followed by 40 cycles of 98℃ for 10 sec, 52.5℃ for 10 sec and 72℃ for 15 sec and a final extension of 72℃ for 5 min. Sponge mitochondrial cytochrome oxidase subunit 1 (COI mtDNA) gene was amplified using the primers dgLCO1490 (5′-GGTCAACAAATCATAAAGAYATYGG-3’) and dgHCO2198 (5′- TAAACTTCAGGGTGACCAAARAAYCA-3′) (Meyer et al. 2005) in a 25 µl reaction containing: 12.9µl distilled water (sterile milliQ); 5µl 5x Thermo Fisher Phire Green; 1.0µl 25mM MgCl2; 1.0µl 10mg.ml− 1 Life BSA; 1.3µl 10 pMol.µl− 1 forward primer; 1.3µl 10 pMol.µl− 1 reverse primer; 0.5µl 2.5mM dNTP; 0.5µl Thermo Fisher Phire II HS Taq and 1.5µl 0.05-0.5ng.µl− 1 DNA template, with the following amplification parameters: initial denature of 98℃ for 30 sec, followed by 40 cycles of 98℃ for 10 sec, 55℃ for 10 sec and 72℃ for 15 sec and a final extension of 72℃ for 5 min. PCR product yield was checked using Invitrogen E-gels 96 2% agarose.
PCR products were sent to Baseclear B.V. in the Netherlands for forward and reverse sanger sequencing using the same primers than those of the PCR amplification. The resulting sequences were analyzed and edited in Geneious Prime 19.2.3. Reads were assembled using a De Novo Assembly. Contigs were stripped of primer sequences and analyzed for reverse complements and quality. Sequences were aligned in Geneious using MAFFT v7.450 (Katoh et al. 2002). A phylogenetic tree was constructed using MrBayes 3.2.6 (Huelsenbeck et al. 2001). A GTR substitution model with ‘invgamma’ rate variation was preferred, setting the chain length to 1100000 with a subsampling frequency of 200 and a burn-in length of 100000.
A first Mantel test was conducted to test the correlation between the phylogenetic distance matrix (based on the 28S rRNA gene) and the geographical distances matrix. The latter was obtained using the GPS coordinates of each sampling site and the earth.dist() function from the “fossil” R package (Vavrek 2011).
DNA extractions, library preparation and high throughput sequencing of 16S rRNA gene amplicons.
For the microbiome analysis, DNA was extracted using the FastDNA™ SPIN Kit for Soil (MP Biomedicals, Inc.) following the manufacturer instructions. Sponge tissues were cut into small pieces of approximately 3*1*0.5mm using sterilized tweezers and scalpel blades. An extraction blank, in which no tissue was added to the tube, was also included.
The library preparation was conducted through a two-step PCR protocol for all samples, together with the extraction blank and two negative controls (mQ water instead of template DNA). For the first PCR, the V4-V5 regions of the 16S rRNA gene were targeted and amplified (Parada et al. 2016). We used 515F-Y/926R primers and the KAPA HiFi HotStart Ready Mix PCR Kit (Roche Molecular Systems, Inc.). Reactions were performed in a T100 Thermal Cycler (Bio-Rad, Hercules, CA, United States). The following thermal cycling scheme was set up: initial denaturation at 95°C for 3 min, 30 cycles of denaturation at 98°C for 20 s, annealing at 50°C for 30 s, followed by extension at 72°C for 30 s. The final extension was carried out at 72°C for 5 min.
PCR products from the samples were checked using an E-Gel™ (agarose gels at 2%) and the absence of amplification was validated for the negative controls and the extraction blank. PCR products were then cleaned with NucleoMag NGS-Beads (bead volume at 0.9 times the total volume of the sample, Macherey Nagel, Düren, Germany) using the VP 407AM-N 96 Pin Magnetic Bead Extractor stamp (V&P Scientific, San Diego, CA, United States). Through a second PCR, 3 µL of the cleaned PCR products were then amplified and labeled using the MiSeq Nextera XTDNA library preparation kit (Illumina, San Diego, CA, United States), with the same thermal cycling scheme limited to 8 cycles. PCR products were then analyzed with the Fragment Analyzer Agilent 5300 using the DNF-910-33 dsDNA Reagent Kit (35–1,500 bp) protocol (Agilent Technologies, Santa Clara, CA, United States) to confirm successful labeling of the DNA fragments. Negative controls and extraction blanks remained negatives.
The pooling at the equimolar concentration was performed with QIAgility (Qiagen, Hilden, Germany). The final pool was then cleaned with NucleoMag NGSBeads, eluted in Milli-Q and the DNA concentration was verified using Tapestation 4150 (Kit HSD 5000, Agilent Technologies, Santa Clara, CA, United States). The sequencing was performed on an Illumina MiSeq V3 PE300 platform at BaseClear B.V. (Leiden, the Netherlands).
16S rRNA gene metabarcoding data processing
The raw reads were first treated by BaseClear B.V. for demultiplexing (using bcl2fastq version 2.20, Illumina) and filtering based on two quality controls (using Illumina Chastity filtering, and a PhiX control signal filtering). The following reads were then processed with the DADA2 workflow, allowing an inference to Amplicon Sequence Variant (ASV) (Callahan et al. 2016a, 2017). The “dada2” R package was used following the workflow described in Callahan et al. (2016b) and guidelines described in the online tutorial (https://benjjneb.github.io/dada2/tutorial.html). The parameters used for filtering and trimming reads were the following ones: truncation length of 290 and 240 base pairs for forward and reverse reads, respectively, maxN = 0, maxEE = 2 and truncQ = 2. The error learning step was based on 153 188 440 and 126 776 640 bases for forward and reverse reads, respectively. After the construction of the ASV table, chimeric sequences were filtered and taxonomic assignment was performed using the Silva v138 reference database (Quast et al. 2013).
The ASV and taxonomy tables produced by the pipeline were then combined into a phyloseq object, together with the sample metadata table using the “phyloseq” R package (McMurdie and Holmes 2013). The dataset was then filtered by removing all sequences affiliated to Eukaryota, chloroplast and mitochondria. Data was then decontaminated with two negative controls and the extraction blank used as control samples, using the “decontam” R package (Davis et al. 2018). The dataset was then divided into two subsets, one with the samples for the farming experiment and the second one with those for the biogeographical study. The sampling time (Donor, 9 months, 12 months, 15 months, and 18 months) and sampling sites were used as factors for the statistical analyses (Table S1 and S2).
Metabarcoding and statistical analyses
After the filtering of reads affiliated to Eukaryota, chloroplast and mitochondria (representing together an average of 0.27% of all reads per sample, SD = ± 0.34%), rarefaction curves performed with the “vegan” R package (Oksanen et al. 2019), reached a plateau for all samples (Figure S3). Since the default DADA2 pipeline is performing a removal of singletons and thus considering the potential loss of very rare taxa, these rarefactions curves and the corresponding α-diversity metrics should be interpreted with caution. Despite the removal of these singletons, we remain confident about the relevant coverage of the richness displayed in the whole study.
The α-diversity measures were estimated with Chao1 (estimated richness), Pielou (evenness) and Shannon (both richness and evenness) indices using the “ade4” and “phyloseq” R packages (Dray and Dufour 2007; McMurdie and Holmes 2013) and the rarefied datasets (rarefaction performed to the minimum library size, i.e. 31559 and 34103 reads, for the farming experiment and the biogeographical datasets respectively). Depending on the normal distribution assessed using a Shapiro test, the significant effect of the sampling times and sites on the diversity metrics was tested using ANOVA followed by HSD Tukey’s tests as parametric tests, as well as Kruskal-Wallis followed by pairwise Wilcoxon tests as non-parametric ones. Following recommendations for compositional approaches from McMurdie and Holmes (2014) and Gloor et al. (2017), all other analyses were conducted without rarefaction, using the phyloseq R package and the datasets normalized to the total number of sequences per sample. For both datasets, the β-diversity of all samples was analyzed with a non-metric multidimensional scaling (NMDS) using the Bray-Curtis dissimilarity. Differences of β-diversity between groups were statistically checked with one-way PERMANOVA tests followed by pairwise Adonis tests.
Distance-based Redundancy Analysis (db-RDA) using the Bray-Curtis dissimilarity was conducted to investigate the effect of environmental parameters on the β-diversity of the prokaryotic community of D. metachromia, using the “vegan” R package. A first analysis was performed with all samples from the farming dataset and tested with all available environmental parameters (temperature, salinity, PO43−, NOx, Si[OH]4, and chlorophyll a). Since some parameters were not described for all sampling sites from the biogeographical study, the analysis was conducted on a restricted dataset with a selection of samples for which the temperature, salinity, PO43−, NOx and Si(OH)4 were measured (Table S2 and S3). This restricted biogeographical dataset includes only samples from French Polynesian sites (except Makemo). The ordiR2step() function was used to verify that all these environmental factors can be selected as significant constraints to be used as explanatory variables. The overall significance of the two models was tested using the anova.cca() function with 999 permutations.
Following db-RDA, variance partitioning analyses (Borcard et al. 1992; Cleary et al. 2022) were conducted using the varpart() function from the “vegan” package for both the farming and the restricted biogeographical datasets. These analyses were performed to assess the relative percentage of the variance of the β-diversity explained by the selected environmental parameters, using the formula (~ temperature, ~ salinity, ~ PO4 3− + NOx + Si[OH]4) for both datasets. Additionally, the variance of the β-diversity explained by environmental parameters was also assessed through a variance partitioning analysis combining the geographical distances. The geographical distances matrix was transformed using the pcnm function and the analysis was conducted using the following formula: (~ temperature + salinity + PO4 3− + NOx + Si[OH]4, ~ pcnm[geographical.dist]$vectors).
For the biogeographical study, Mantel tests (Spearman correlation, 9999 permutations) were conducted within all samples to evaluate the correlation between the β-diversity distance matrix and the geographical distance matrix. Excepted for samples not barcoded (Lagon 1 and Lagon 2, Fig. 1, Table S2), an additional Mantel test was also conducted with the biogeographical dataset to assess the correlation between the β-diversity distance matrix and the phylogenetic distance matrix.
The "core community" was obtained using the “microbiome” R package (Lahti et al. 2017) with the core() function: a prevalence threshold of 90% in all samples (including both geographical and farming datasets) and an abundance threshold of 0.01% were kept. Percentages of core ASVs numbers and sequences were estimated by comparison with the total community and statistically tested using Shapiro, ANOVA and HSD Tukey’s tests.