Conservation Genetics and Gut Microbial Communities’ Variability of the Critically Endangered European Mink Mustela Lutreola: Implications for Captive Breeding Programs

: Host’s fitness can be affected by its genotype and gut microbiota, defined as the microbes 15 living in the host’s intestinal tract. This study explored how the genetic diversity of the host 16 influences its bacterial communities in the context of captive breeding programs, for the 17 critically endangered European mink ( Mustela lutreola ). As stated by the ecosystem on a 18 leash model, loss of host genetic diversity may lead to changes in immunomodulation and 19 will therefore induce modifications of the gut microbiota. We investigated variation in the 20 gut bacteria through 16S rRNA metabarcoding, related to the genetic diversity of European 21 mink held in captivity in two breeding centers representing separate breeding stocks 22 originating from the western and eastern populations. The genetic diversity of the host was 23 assessed through diversity analysis of the adaptive MHC class I and II genes as well as neutral microsatellite markers. Results indicate lower diversity in neutral and MHC class I 25 genes for the western population, and the opposite for MHC class II. A lower MHC class 26 II gene variability led to an increase in microbial phylogenetic diversity and in abundance 27 depending on the presence of specific MHC-II motifs. Those results seem to be linked to 28 management practices that differs between the two programs, especially the number of 29 generations in captivity. Long term Ex situ conservation practices can thus modulate gut 30 microbial communities, that might potentially have consequences on the survival of 31 reintroduced animals. We suggest strategies to foster genetic diversity in captive breeding 32 program to mitigate the effects of genetic drift on those small, isolated populations.


Introduction 38
More than 5,800 animal species to date are endangered, as the Earth experiences a mass 39 extinction event (Ceballos et al., 2015). Intrinsic drivers of extinction, such as genetic 40 factors, play a key role for population viability, especially when species are reduced to 41 small, isolated populations that can be negatively affected by genetic load (Hedrick, 2001). 42 In this scenario, finding a suitable mate is challenging and reproduction with related 43 individuals can occur, leading to inbreeding depression. Inbreeding has largely been 44 documented in small populations in the wild (Hedrick, 2001; reviewed in Spurgin & Gage,45 2019), impacting population fitness through the fixation of detrimental alleles. An increase 46 with a depth of 50,000 reads/sample for MHC genotyping and was completed using an 205 Illumina NovaSeq platform (Illumina Biotechnology Co., Novogene, UK). 206 To analyze MHC-I and MHC-II amplicon sequences, we used the three-step pipeline 207 AmpliSAS (Sebastian et al., 2015). Low-quality sequences with Phred scores lower than 208 20 were removed and clustering was conducted using the default parameters for Illumina 209 sequences. Already identified alleles of MHC-II DRB for E-mink were extracted from 210 NCBI (Becker et al., 2009), as well as sequences from related species (Mustela putorius 211 and Mustela itatsi) for MHC-I exon 2. If NCBI blast reveled 100% of sequence identify 212 between the discovered alleles in this study and already identified one, their name was 213 replaced by the accession number of these sequences. For the subsequent analysis, we 214 focused on the amino acid translated sequences (referred as MHC motifs) as they are in 215 direct contact with bacteria. We measured motif richness as the number of sequences per 216 individual for each locus. We calculated functional distances between individuals 217 following the approach described in Strandh et al. (2012). A maximum-likelihood tree was 218 constructed based on the chemical binding properties of the amino acids, as described by 219 five physico-chemical descriptor variables (z-descriptors) for each amino acid, using 220 sequences of Meles meles, Meles leucurus, Meles anakuma and Martes zibelina as out-221 group retrieved through NCBI blast ( Figure S1). The trees were used as reference from 222 which the functional distances between individuals were calculated using unweighted 223 UniFrac for both genes (Lozupone & Knight, 2005). Following Bolnick et al. (2014), the 224 genetic distance between each amino acid sequences within each individual (Faith's PD) 225 were calculated, and further defined as motif divergence. 226

Statistical analysis for MHC and microsatellites markers between populations 227
For subsequent analyses, individuals were sorted according to their population origin 228 (western and eastern) that corresponded to structure results (admixed individuals (q < 0.9) 229 were excluded = 0 individuals). Non-parametric Kruskal-Wallis rank sum tests were used 230 to investigate neutral genetic variation with F and MLH calculated from microsatellite 231 markers analysis, between the two E-mink populations and host sex. Adaptive genetic 232 diversity and variation between E-mink populations and individual sex was also observed 233 using the same approach for motif richness for both MHC genes and divergence. 234 Permutational multivariate analyses of variance (PERMANOA) models adonis from the 235 vegan package were constructed with 9,999 permutations with reported F, R2, and p-236 values, to determine whether there were significant differences in genetic distance matrices 237 for neutral and adaptive markers between E-mink population, individual sex and birth 238 location. Pearson's pairwise correlation tests between the presence of MHC motifs, present 239 in at least three individuals, were conducted to potentially represent haplotype blocks for 240 the two genes investigated. The significance cutoff was set to p-value < .05 for each test. 241 Mantel tests were also conducted between each neutral, MHC-I and MHC-II genetic 242 distance matrices with 1,000 permutations to investigate correlation between each marker. 243

Microbiota data generation and processing 244
After DNA extraction, the targeted gene for bacterial taxonomic affiliation using broad 245 bacterial primers of the region V4 of the 16S rRNA gene (515F-806R, 390 bp) was 246 amplified through PCRs. Amplification, library preparation and sequencing were carried 247 out in a similar manner to MHC genotyping, with a depth of 100,000 reads/samples of two 248 libraries composed of 48-52 samples.

12
The quality controls of the demultiplexed paired-end reads were performed through the 250 software FastQC (Andrews, 2010). Demultiplexed sequence reads denoising and amplicon 251 sequence variants (ASVs) picking steps were done with the QIIME2 tool (Bolyen et al.,252 2018; v. 2019.1), using the DADA2 pipeline (Callahan et al., 2016;Callahan et al., 2017). 253 Samples were pooled by individuals to limit bias from diet foods prior to rarefaction. 254 Rarefaction was conducted at 27,000 reads/samples in sampling depth. ASVs-or also 255 referred to as bacterial phylotypes-were then screened to the 97% 16S rRNA gene full-256 length reference sequences from the Silva v.132 database (Pruesse et al., 2007) for 257 taxonomical association using the VSEARCH classifier implemented in QIIME2 258 (Bokulich et al., 2018). Sequence alignment and phylogeny building were conducted in 259 QIIME2 for the construction of UniFrac distance matrices. 260

Statistical analysis for α-diversity of gut bacteria according to host information 261
All statistical analyses were conducted using the phyloseq and microbiome packages for 262 manipulation of data. Total observed number of bacterial taxa, Chao1, Shannon's diversity 263 index and Faith's PD in each sample were used as metrics to measure the α-diversity of gut 264 bacteria between samples. Chao1 characterizes the overall phylotype richness within a 265 host, Shannon's diversity index considers richness and abundance, and Faith's PD is an 266 indicator of genetic diversity within a sample. Differences in the indices according to E-267 mink populations, sex and birth location of the host were analyzed using Kruskal-Wallis 268 rank sum tests. Linear regression models were also conducted with the different measures 269 of the microsatellites and MHC analysis (MLH, F for microsatellite data, MHC motif 270 richness and divergence) as predictors, and microbial richness indexes as response variable. with gut microbial composition, we employed Mantel tests between each genetic distance 284 and Unifrac distance. 285 A differential abundance analysis was conducted on the raw ASVs count (after filtering, 286 prior to rarefaction) that were present in more than 10% of all the samples and that had a 287 relative abundance of more than 5% among all taxa. It corresponds to the core microbiota 288 of the dataset, represented by 1203 phylotypes. The abundance analysis was made at the 289 ASV level with the DESeq2 package, using a negative binomial Wald test to test 290 significance in contrast between each E-mink population and each common MHC motifs 291 that were present in at least two individuals. Only microbial ASV with a significance level 292 (α) below .001 after false discovery rate (FDR) corrections were considered using the 293 14 tested for Pearson correlations between microbial genera' relative abundance per 295 individual and continuous genetic variables (MLH, motif richness and PD for both MHC 296 class I and II) with FDR corrections, for genera that were encountered at least in three 297

Genetic diversity of the European mink captive populations 300
The microsatellite markers analysis demonstrated an overall population allelic richness per 301 locus of 2.69 with an average of 2.49 in the western population, and 2.82 in the eastern 302 population. Heterozygosity values were lower in the western population compared to 303 eastern population (Table 1). Bayesian assignment recovered two genetic clusters within 304 our population, and no admixture pattern were detected. All individuals clustered according 305 to populations, corresponding to the two different breeding facilities. The offspring with 306 parents of each population was assigned to the western population according to clustering 307 (p(Kwestern)=0.975). Multilocus heterozygosity was slightly higher in eastern than western 308 populations (Kruskal-Wallis: χ 2 =3.4761; p-value=0.0623), but the inbreeding coefficient 309 (F) was not (Kruskal-Wallis: χ 2 =0.085714; p-value=0.7697). Overall, sex and birth 310 location had no significant effect on neutral markers' diversity and richness. 311 PERMANOVA on genetic distance based on the microsatellite markers detected no 312 variation according to E-mink population, sex and birth location (Table 2).  Table S2). Most of the variation encountered in both genes was 328 expressed in amino acid residues that influence the binding of CD4 and CD8 glycoproteins 329 involved in antigen presentation for adaptive immunity ( Figure S3). 330 Motif richness and divergence (Faith's PD) were significantly higher in the western E-331 mink compared to eastern E-mink for MHC-II gene (Kruskal-Wallis: χ 2 =13.456, p-332 value=0.0002; χ 2 =8.0614, p-value=0.0045; respectively). However, for MHC-I, 333 divergence was higher in eastern E-mink compared to western E-mink, but not motif 334 richness (Kruskal-Wallis: χ 2 =5.0097, p-value=0.0252; χ 2 =1.5456, p-value=0.2138, 335 respectively). No changes in motif richness nor divergence were observed according to sex 336 for the two genes. However, we did observe significant variation in MHC-II richness 337 according to birth location (Kruskal-Wallis: χ 2 =10.854, p-value=0.0125), and a Dunn test 338 with Benjamini-Hochberg correction only detected higher motif richness for the MHC-II 339 gene in captive-born E-mink in Spain compared to the EEP (Dunn: Z=-2.748, adjusted p-340 value=0.0358). PERMANOVA detected a significant influence of E-mink sex for MHC-I 341 genetic distance, as well an influence of mink population close to the significance threshold 342 ( Figure S5), whereas E-mink population was the only variable influenced MHC-II 343 composition variation (Table 2). Finally, Mantel tests showed a positive correlation 344 between MHC -I and neutral markers distances (Mantel: r=0.2761, p-value=0.001). 345

α-diversity of gut bacteria according to host information 346
A sample of mock community containing known concentrations of genomic DNA from 20 347 bacterial strains was sequenced. 19 of the 20 different strains originally included in the 348 sample were detected. The undetected strain was present at the lowest concentration. 349 Therefore, our protocol allowed bacterial DNA detection and identification to the genus 350 level as long as its concentration in the DNA extract was at least 2.8 pg/μl, and provided 351 that the sequence was included in the reference database. particularly strong estimates for MHC-I richness ( Figure S4). 368

β-diversity of gut bacteria between E-mink and differential abundance 369
Beta diversity analyses revealed that bacterial communities were significantly different in 370 composition according to E-mink population only when considering unweighted Unifrac 371 distances, whereas no significant differences in microbial community composition were 372 found between host sexes nor birth locations (Models 4 and 6, Table 2). Moreover, MHC-373 II gene richness had a small significant influence on gut microbial composition (Model 5, 374 Table 2). This is reflected in the results of the PCoA, which clustered individuals with  We recovered several bacterial genera and families whose relative abundances were 381 significantly correlated with MHC-I and MHC-II richness and divergence, while some 382 marginally correlated with multilocus heterozygosity of neutral markers (Table 3) Overall, our prediction that less host control will be observed in mink with lower genetic 474 diversity is supported by both alpha and beta diversity for the E-mink. However, both 475 populations have low genetic diversity, and the MHC class II DRB gene seemed to have a 476 stronger influence in gut microbes than other markers. To further validate our results, 477 replicating the study to see if those differences are observable when individuals from the 478 22 two populations are kept in the same facility to control for the influence of the external 479 environment should be conducted. Given that we only had access to samples from a small 480 fraction of the captive eastern population, our results might also not be representative of 481 the entire captive breeding stock. Despite the gut microbiota variation being a complex 482 puzzle, our study gives more importance to host immunogenetics in the context of species 483 conservation. 484

Adaptation to captivity and management practices 485
For MHC genes, rare allele and heterozygous advantage are two types of balancing 486 selection that have been suggested to be important in maintaining high levels of adaptive 487 genetic diversity (Sommer, 2005). Assuming that rare and divergent MHC genotypes are 488 more likely to induce host control on gut microbes, giving a fitness advantage to the host, 489 the co-evolutionary arm race with gut microbes will foster adaptation from microorganisms 490 One strategy is to translocate animals between breeding centers for reproduction to prevent 518 loss of genetic diversity. Similar to the western captive population of E-mink, these 519 translocations could be composed of wild-born individuals, free of captive selection 520 pressure (Schulte-Hostedde & Mastromonaco, 2015). Occasional translocations from 521 western to eastern captive populations could also be conducted and would potentially 522 mitigate the modest reproductive success within the program. It is worth noting that wild-523 born animals have been out of reach from the EEP breeding stock so far. However, 524 conducting preliminary MHC variation assessment on reintroduced animals from the 525 eastern stock present in Hiiumaa island, as they no longer face captivity for a number of 526 generations, could be used to identify potential assets to the current breeding stock. 527 Captivity has been shown to alter gut microbial communities (McKenzie et al., 2017).    Figure 5. Mean relative abundance of each family that experienced significant differential 626 abundance between the two mink populations (orange: Western; blue: Eastern, mean 627 relative abundance per population + standard error). Colored circles next to the family 628 name correspond to which MHC motifs absence/presence the variation in abundance was 629 significant for (yellow: MHCII, blue: MHCI, green: both MHCI and MHCII). 630 631