Hill-based dissimilarity indices and null models for analysis of microbial community assembly

doi:10.21203/rs.2.24335/v1

Methodology

This work is licensed under a CC BY 4.0 License

Journal publication: published in Microbiome, 11 Sep, 2020

Version 1 (older preprint version)

Background: High-throughput amplicon sequencing of marker genes, such as the 16S rRNA gene in Bacteria and Archaea, provides a wealth of information about the composition of microbial communities. To quantify differences between samples and draw conclusions about factors affecting community assembly, dissimilarity indices are typically used. However, results are subject to several biases and data interpretation can be challenging. The Jaccard and Bray-Curtis indices, which are often used to quantify taxonomic dissimilarity, are not necessarily the most logical choices. Instead, we argue that Hill-based indices, which make it possible to systematically investigate the impact of relative abundance on dissimilarity, should be used for robust analysis of data. In combination with a null model, mechanisms of microbial community assembly can be analyzed. Here, we also introduce a new software package, qdiv, which enables rapid calculations of Hill-based dissimilarity indices in combination with null models.

Results: Using amplicon sequencing data from two experimental systems, aerobic granular sludge (AGS) reactors and microbial fuel cells (MFC), we show that the choices of bioinformatics pipeline and dissimilarity index can have considerable impacts on results and conclusions. Analysis of the AGS data set showed that results are sensitive to bioinformatics choices when dissimilarities between sample groups are compared with incidence-based indices. Analysis of the MFC data set with a combination of Hill-based indices and a null model revealed that random dispersal could explain the distribution of both rare and highly abundant taxa within a glucose-fed MFC whereas the distribution of taxa of intermediate relative abundance was governed by heterogeneous selection.

Conclusions: Hill-based indices provide a rational framework for analysis of dissimilarity between microbial community samples. In combination with a null model, the effects of deterministic and stochastic factors on taxa of low, intermediate, and high relative abundance during microbial community assembly can be systematically investigated. Calculations of Hill-based dissimilarity indices in combination with a null model can be done in qdiv, which is freely available as a Python package (https://github.com/omvatten/qdiv).

General Microbiology

Keywords: Aerobic granular sludge, Amplicon sequencing, Beta diversity, Bioinformatics, Microbial ecology, Microbial fuel cell

Microbial communities drive global cycles of elements and play important roles for human health, food production, and environmental engineering services such as wastewater treatment. On Earth, there may be as many as 10¹² different microbial species [1] and understanding how communities assemble, develop, and function is a formidable task. During the last decades, significant progress in DNA sequencing technology has provided a wealth of information about the diversity of microbial communities in both natural and engineered environments. Polymerase chain reaction (PCR) amplification of parts of the 16S rRNA gene followed by high-throughput sequencing using platforms such as 454 pyrosequencing, Illumina, Ion Torrent PGM, and PacBio has made it possible to probe millions of sequences in samples. For example, the Illumina MiSeq platform and dual-indexing of PCR primers allow over 100 samples to be sequenced in parallel at a depth exceeding 10 000 reads per sample [2, 3]. In addition to the rRNA gene, PCR targeting functional genes, such as the amoA in ammonia-oxidizing bacteria, can be used to study specific functional groups [4].

Interpretation of results from high-throughput amplicon sequencing experiments is, however, challenging. Varying copy numbers of the target gene, sampling, DNA extraction, PCR amplification, and sequencing can all lead to biases, which distort the relative proportions of taxa in a sample [5–7]. For example, Gonzalez et al. [8] showed that taxa with low abundance are typically underrepresented in PCR-based assays. PCR and sequencing also produce error-containing sequences [9]. Several computational pipelines can be used to differentiate between correct and erroneous sequence reads. After quality filtering, the reads are typically clustered into operational taxonomic units (OTUs), which are formed by grouping sequences that are similar. A similarity threshold of 97% has commonly been used. Recently, alternative approaches, which instead of OTU-clustering denoise the reads and derive exact biological sequences, have been developed [10–12]. The denoiser algorithms use different methods to differentiate between true amplicon sequence variants (ASVs) and errors. The generated ASVs can differ from each other by as little as one nucleotide, which makes it possible to investigate microbial diversity at higher resolution [e.g. 13]. Another advantage is that the ASVs represent true sequences and can be compared to results from other sequencing runs. In OTU clustering, the centroid sequences which represent the OTU, as well as the classification of a read to an OTU, depend on all the other sequences in the run [14]. Thus, OTU sequences do not have a meaning outside of the specific context in which they are generated [15].

Once OTUs or ASVs have been determined, it is often of interest to study compositional differences between microbial communities in samples collected from different locations or time points (beta diversity). Indices describing the similarity or difference between sampled communities using a single number are commonly used. Many dissimilarity indices are available [16, 17]. Some, such as the Jaccard and Sørensen indices, are incidence-based, which means they do not consider differences in relative abundance of different OTUs/ASVs. Other indices take the relative abundance into account. In microbial community assays it is difficult to know how much weight should be put on the relative abundance of individual OTUs/ASVs. On the one hand, we know that the read abundance and the true relative abundance of microorganisms do not always correlate in PCR-based assays [18]. Rare OTUs/ASVs often are underrepresented [8] but can play important roles for community function [19]. It may therefore be tempting to use indices that weigh detected OTUs/ASVs equally. On the other hand, we know that PCR and sequencing cause errors, which may remain in the dataset after bioinformatics processing [9, 20]. Microbial communities typically also contain a long tail of extremely low-abundant taxa and random sampling affects the observed dissimilarity [5]. This view would favor the use of an index giving higher weight to abundant OTUs/ASVs; and indeed, the Bray-Curtis index, which takes relative abundance into account, is probably the most commonly used taxonomic dissimilarity index in microbial ecology.

There are, however, other indices that deserve more attention. Hill numbers are a set of diversity indices for which the weight given to the relative abundance of a species can be varied. Hill numbers, which are also called effective number of species, were originally presented as measures of alpha diversity, i.e. species diversity within a community [21]. Eq. 1–2 show how Hill numbers are calculated. The diversity order (q) determines the weight given to the relative abundance of an OTU/ASV in a community. For example, if q is 0, the relative abundance is not considered; if q is 1, the OTUs/ASVs are weighted exactly according to their relative abundance; and if q is higher than 1, more weight is given to OTUs/ASVs having high relative abundance.

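$$ {}^{q}D = \left( \sum_{i=1}^{S} p_i^{q} \right)^{1/(1-q)}, \quad q \neq 1 \tag{1} $$

$$ {}^{1}D = \exp\left( -\sum_{i=1}^{S} p_i \ln p_i \right) \tag{2} $$

D is the Hill number, q is the diversity order, S is the total number of species, and p_i is the relative abundance of the i^th species in the community.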
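
The calculation can be illustrated with a few lines of Python. This is a minimal sketch that assumes the standard definition of Hill numbers (the sum of p_i^q raised to the power 1/(1-q), with the exponential of the Shannon entropy as the limit at q = 1). The function name `hill_number` and the read counts are hypothetical and are not taken from qdiv.

```python
import math

def hill_number(counts, q):
    """Return the Hill number (effective number of taxa) of diversity order q."""
    total = sum(counts)
    # Relative abundances of the taxa that were actually detected.
    p = [c / total for c in counts if c > 0]
    if math.isclose(q, 1.0):
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))

community = [50, 30, 10, 5, 5]  # hypothetical read counts for five ASVs
for q in (0, 1, 2):
    print(q, hill_number(community, q))
```

At q = 0 the function returns the richness, and the value can only decrease as q increases because more weight is placed on the dominant taxa.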

For two or more communities, Hill numbers can be decomposed into alpha (α), gamma (γ), and beta (β) components [22]. ^qD_α is the Hill number for a single community, ^qD_γ is the Hill number for the combined communities (i.e. the regional community), and ^qD_β is the ratio between the two (Eq. 3).

$$ {}^{q}D_{\beta} = \frac{{}^{q}D_{\gamma}}{{}^{q}D_{\alpha}} \tag{3} $$

The parameter ^qD_β is, thus, a measure of the difference between the communities being compared and it can be converted to a dissimilarity index constrained between 0 and 1 [23] (Text S1, supplementary material). In this paper, we denote the Hill-based dissimilarity index as ^qd. For a pair of samples, the ^qd value represents the effective average proportion of OTUs/ASVs in one sample not shared with the other sample [24].
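
A pairwise version of this calculation might look as follows. The exact conversion from ^qD_β to ^qd is given in Text S1 and is not reproduced here, so this sketch assumes equal sample weights and the commonly used "local overlap" rescaling, which is consistent with the interpretation of ^qd as the effective average proportion of non-shared OTUs/ASVs. The function name `hill_dissimilarity` is hypothetical and is not the qdiv API.

```python
import math

def hill_dissimilarity(sample_a, sample_b, q):
    """Pairwise Hill-based dissimilarity (0 = identical, 1 = nothing shared).

    sample_a, sample_b: dicts mapping taxon name -> read count.
    """
    taxa = set(sample_a) | set(sample_b)
    tot_a, tot_b = sum(sample_a.values()), sum(sample_b.values())
    pa = {t: sample_a.get(t, 0) / tot_a for t in taxa}
    pb = {t: sample_b.get(t, 0) / tot_b for t in taxa}
    # Pooled (regional) community with equal weight on both samples.
    mean_p = {t: (pa[t] + pb[t]) / 2 for t in taxa}

    if math.isclose(q, 1.0):
        def entropy(p):
            return -sum(v * math.log(v) for v in p.values() if v > 0)
        # ln(beta) = gamma entropy minus mean alpha entropy; ln(2) rescales to 0..1.
        ln_beta = entropy(mean_p) - (entropy(pa) + entropy(pb)) / 2
        return ln_beta / math.log(2)

    alpha_sum = (sum(v ** q for v in pa.values() if v > 0)
                 + sum(v ** q for v in pb.values() if v > 0)) / 2
    gamma_sum = sum(v ** q for v in mean_p.values() if v > 0)
    beta = (gamma_sum / alpha_sum) ** (1.0 / (1.0 - q))
    # Assumed "local overlap" rescaling of beta (range 1..2) to a 0..1 similarity.
    overlap = ((1 / beta) ** (q - 1) - 0.5 ** (q - 1)) / (1 - 0.5 ** (q - 1))
    return 1 - overlap
```

Under these assumptions, two identical samples give 0 and two samples without shared taxa give 1 at every diversity order.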

The use of Hill numbers is more common in the macroecological literature, both as measures of alpha diversity and for partitioning of diversity [25]. For microbial community studies using high-throughput amplicon sequencing, Hill numbers have also been recommended as measures of alpha diversity [26, 27]. However, Hill-based indices are rarely used to quantify beta diversity. In two recent studies, we used Hill-based dissimilarity indices of specific diversity orders to quantify differences between microbial communities, giving different weight to the relative abundance of OTUs/ASVs [28, 29]. In this paper, we will show that examining dissimilarity (^qd) for a continuum of diversity orders is a rational approach to illustrate how OTUs/ASVs with different relative abundance contribute to the dissimilarity between communities.

A difficulty with analyzing beta diversity, irrespective of the chosen index, is the interpretation of the results. We might be interested in determining if deterministic factors select for the same or different OTUs/ASVs in two sampled habitats or if the distribution of OTUs/ASVs between the habitats is governed by stochastic factors. The dissimilarity value alone tells us nothing about this. For example, if two habitats have different areas for microbial growth, the habitat with the larger area will likely have higher richness (number of detected OTUs/ASVs) because of the taxa-area relationship [30]. Since alpha- and beta diversity are not independent (Eq. 3), the richness difference will cause a high observed dissimilarity even if the two habitats select for the same OTUs/ASVs [31, 32]. Null models are useful in the interpretation of dissimilarity values and allow us to differentiate between different community assembly mechanisms [32, 33]. A null model introduced by Raup and Crick [34] and developed by Chase et al. [32] controls for richness differences between samples. Two samples with pre-defined numbers of OTUs/ASVs are randomly assembled from a regional pool. The regional pool would consist of all OTUs/ASVs detected in the two samples combined and could also include OTUs/ASVs detected in other samples that could possibly colonize the studied habitat. The random assembly process is repeated many times and a null distribution for the dissimilarity between the two samples is generated. This generated null distribution is then compared to the actually observed dissimilarity. If the values are similar, the observed dissimilarity can be explained by stochastic factors. If the observed dissimilarity is higher or lower than the null expectation, there are likely deterministic factors that favor different or similar taxa in the two habitats [33]. 
The Raup-Crick model was originally developed for incidence-based data [32, 34] and was recently extended to also function with the Bray-Curtis index [35]. In this paper, we further extend the Raup-Crick null model to function with the whole continuum of Hill-based dissimilarity indices (^qd) (Text S2, supplementary material). The index, here denoted as the Raup-Crick index for diversity order q (^qRC), is calculated using Eq. 4.

$$ {}^{q}RC = \frac{N_{[{}^{q}d_{exp} < {}^{q}d_{obs}]} + 0.5 \, N_{[{}^{q}d_{exp} = {}^{q}d_{obs}]}}{N_{TOT}} \tag{4} $$

N_{[qdexp<qdobs]} is the number of randomizations in which the dissimilarity between the randomly assembled samples is less than between the observed samples, N_{[qdexp=qdobs]} is the number of randomizations in which the dissimilarities are equal, and N_TOT is the total number of randomizations.
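
The randomization procedure can be sketched in Python for the incidence-based case. This is a simplified illustration, not the procedure of Text S2: it assumes that taxa are drawn from the regional pool with equal probability, it uses a Sørensen-type dissimilarity as a stand-in for ⁰d, and it assumes that the index is the fraction of randomizations with lower dissimilarity plus half of the ties. All function names are hypothetical.

```python
import math
import random

def sorensen_dissimilarity(taxa_a, taxa_b):
    """Incidence-based dissimilarity between two sets of taxa."""
    shared = len(taxa_a & taxa_b)
    return 1 - 2 * shared / (len(taxa_a) + len(taxa_b))

def raup_crick(taxa_a, taxa_b, pool, n_rand=999, seed=0):
    """Compare the observed dissimilarity with a null distribution.

    Richness of each sample is preserved; taxa are drawn uniformly
    from the pool (a simplifying assumption).
    """
    rng = random.Random(seed)
    pool = sorted(pool)  # fixed order so that a given seed is reproducible
    d_obs = sorensen_dissimilarity(taxa_a, taxa_b)
    n_less = n_equal = 0
    for _ in range(n_rand):
        rand_a = set(rng.sample(pool, len(taxa_a)))
        rand_b = set(rng.sample(pool, len(taxa_b)))
        d_exp = sorensen_dissimilarity(rand_a, rand_b)
        if math.isclose(d_exp, d_obs, abs_tol=1e-12):
            n_equal += 1
        elif d_exp < d_obs:
            n_less += 1
    # Fraction of randomizations below the observed value, ties counted as half.
    return (n_less + 0.5 * n_equal) / n_rand
```

Values near 1 indicate that the observed samples are more dissimilar than expected by chance, values near 0 that they are more similar, and values around 0.5 that the observed dissimilarity is indistinguishable from the null expectation.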

The goal of this study is to show how bioinformatic pipelines and choice of dissimilarity index impact the results from high-throughput amplicon sequencing experiments. We examine sequencing data from a new experiment with aerobic granular sludge (AGS) reactors and we re-analyze a previously published data set [28] from a study with microbial fuel cells (MFCs). In the AGS experiment, we test the hypothesis that two bioreactors started from the same inoculum and operated under identical conditions for 150 days exhibit the same change in microbial community composition compared to the inoculum. In the MFC experiment, we test the hypothesis that microbial communities growing in different habitats within a glucose-fed MFC are more similar than microbial communities growing in different habitats within an acetate-fed MFC. We show that the conclusions from an experiment may differ depending on the chosen pipeline settings and dissimilarity index. We propose that a solution to this problem is to analyze community dissimilarity for a span of diversity orders using Hill-based indices, and we demonstrate that for the whole range of dissimilarity indices, null models can be used to disentangle community assembly mechanisms. Finally, we introduce a free software and Python package, qdiv, which enables rapid and simple calculations of the indices. Our study focuses on taxonomic dissimilarity indices. The presented methods could, however, be extended to indices taking phylogenetic relationships into account.

Behavior of Hill-based dissimilarity indices and the ^qRC null model

Count tables from microbial community surveys typically consist of a few highly abundant OTUs/ASVs and many low-abundant ones. Using a highly simplified count table (Table S1, supplementary material), we demonstrate how the Hill-based dissimilarity indices behave in comparison to the Jaccard and Bray-Curtis indices, which are more commonly used in microbial community studies.

First, let us consider the situation when samples have equal richness, i.e. the same numbers of detected species (Fig. 1A). Four samples (S0, S1, S2, S3) each have 2 abundant, 4 intermediate, and 8 rare species. Samples S0 and S1 share 1 abundant, 2 intermediate, and 4 rare species. As expected, the Hill-based dissimilarity (^qd) between S0 and S1 is 0.5 for all values of q. Sample S0 and S2 share half of the rare and intermediate species, but none of the abundant species and consequently ^qd goes towards 1 as q increases. Samples S0 and S3 share all intermediate species, but only 1 of the abundant and 1 of the rare, and consequently we see a valley in the ^qd vs q curve. When the samples have the same species abundance distributions, the Bray-Curtis dissimilarity is identical to ¹d. However, for S4, which has the same richness as S0 but a different species abundance distribution, the Bray-Curtis index is different from ¹d.
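
The relationship between the Bray-Curtis index and ¹d can be checked numerically. The sketch below uses hypothetical read counts (100, 10, and 1 for abundant, intermediate, and rare species), since Table S1 is not reproduced here, and it assumes that ¹d is the entropy-based (Horn-type) dissimilarity for two equally weighted samples. Bray-Curtis is computed on relative abundances.

```python
import math

def rel_abund(sample):
    total = sum(sample.values())
    return {t: c / total for t, c in sample.items()}

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity on relative abundances."""
    pa, pb = rel_abund(a), rel_abund(b)
    return 1 - sum(min(pa.get(t, 0), pb.get(t, 0)) for t in set(pa) | set(pb))

def d1(a, b):
    """Hill-based dissimilarity of order 1 for a pair of samples (assumed form)."""
    pa, pb = rel_abund(a), rel_abund(b)
    mean_p = {t: (pa.get(t, 0) + pb.get(t, 0)) / 2 for t in set(pa) | set(pb)}
    def entropy(p):
        return -sum(v * math.log(v) for v in p.values() if v > 0)
    return (entropy(mean_p) - (entropy(pa) + entropy(pb)) / 2) / math.log(2)

def make_sample(abundant, intermediate, rare):
    """Build a sample with hypothetical counts of 100, 10 and 1 per class."""
    sample = {name: 100 for name in abundant}
    sample.update({name: 10 for name in intermediate})
    sample.update({name: 1 for name in rare})
    return sample

rare_a = ["r%d" % k for k in range(1, 9)]
rare_b = ["r%d" % k for k in (1, 2, 3, 4, 9, 10, 11, 12)]
s0 = make_sample(["a1", "a2"], ["i1", "i2", "i3", "i4"], rare_a)
# Shares half of the intermediate and rare species with s0, no abundant ones.
s2 = make_sample(["a3", "a4"], ["i1", "i2", "i5", "i6"], rare_b)
# Same taxa as s0 but a perfectly even abundance distribution.
s_even = {name: 1 for name in s0}

print(bray_curtis(s0, s2), d1(s0, s2))          # same abundance distribution
print(bray_curtis(s0, s_even), d1(s0, s_even))  # different abundance distribution
```

With identical abundance distributions the two indices coincide, whereas they diverge for the evenly distributed sample, mirroring the S4 case described above.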

Second, let us consider the situation when samples have unequal richness (Fig. 1B). Samples S5-S7 have only two species each. In S5, those two species are the same as the most abundant ones in sample S0 and consequently, ^qd decreases with increasing q. In S6, the two species are the same as two intermediates in S0 and we can see a valley in the curve. In S7, the two species are the same as two rare ones in S0 and the dissimilarity increases with q. The Bray-Curtis index shows a different behavior. For S0-S5, Bray-Curtis is equivalent to Hill-based dissimilarity with a low diversity order (q) of 0.52 and for S0-S6 and S0-S7 it is equivalent to diversity orders (q) much higher than 2.

Using the ^qRC null model, we can compare the observed dissimilarity between two samples to the expected dissimilarity if the two sampled communities had been randomly assembled from a regional species pool. In Fig. 1C-D, the sample pair S0-S3 is used as an example. For values of q close to 0, the observed dissimilarity is higher than the null expectation and consequently ⁰RC is 1. For higher values of q, the observed dissimilarity is close to the null expectation and consequently the ^qRC values are intermediate, i.e. close to neither 0 nor 1 (Fig. 1D). For this theoretical example, it means that if we consider the common species (q≈1), the observed dissimilarity could be explained by random assembly of the two communities from the regional species pool, but if we give equal weight to all species (q≈0), the observed dissimilarity is higher than we can expect from a random assembly process.

Effects of bioinformatics choices on count tables

Samples collected from two experiments (AGS and MFC) were sequenced in two separate sequencing runs. The sequences were processed using DADA2 version 1.10 [36], Deblur version 1.04 [37], USEARCH version 10 [38], and Mothur version 1.41 [39] with various settings, resulting in 11 count tables for each experiment (Text S3, supplementary material). In USEARCH, either the UNOISE pipeline was used to generate ASVs or UPARSE was used to generate OTUs with a 97% OTU-clustering threshold.

Fig. 2 shows information about the generated count tables, including how the sequences were processed and the numbers of reads and OTUs/ASVs. Sequence reads were processed sample-by-sample (separate) or by pooling the reads in all samples (pooled). Relaxed quality filtering thresholds resulting in a higher number of OTUs/ASVs (high) or stringent settings resulting in fewer OTUs/ASVs (low) were used in some of the pipelines. Deblur resulted in the lowest number of OTUs/ASVs in both the AGS and MFC experiments and Mothur resulted in the highest. There was a large span in the number of inferred OTUs/ASVs for different pipelines. In the AGS experiment the number ranged from 690 to 4055 and in the MFC experiment it ranged from 1800 to 6457. During analysis of dissimilarities, all count tables were rarefied (without replacement) to the read count in the sample with the lowest number of reads. This resulted in count tables with 223 692 to 321 060 reads per sample in the AGS data set and 14 896 to 35 680 in the MFC data set (Fig. S1, supplementary material). The order of magnitude difference in reads per sample for the two data sets was caused by a higher number of total reads, a lower number of samples, and a more even sequencing depth in the AGS data set. Rarefaction curves for each count table are shown in Fig. S2-S3 (supplementary material). For count tables with stringent quality filtering thresholds and relatively few OTUs/ASVs, the rarefaction curves reach a horizontal asymptote, whereas increasing numbers of OTUs/ASVs with increasing sampling depth are observed for some count tables.
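
Rarefying without replacement to the depth of the smallest sample can be sketched as follows. This is an illustrative standard-library implementation with hypothetical function names and toy counts, not the routine used in qdiv. Expanding every read into a list is simple but memory-hungry for very deep samples.

```python
import random
from collections import Counter

def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one sample."""
    # One list entry per read, labelled with the taxon it belongs to.
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    return Counter(rng.sample(reads, depth))

def rarefy_table(table, seed=1):
    """Rarefy every sample to the read count of the smallest sample."""
    rng = random.Random(seed)
    depth = min(sum(counts.values()) for counts in table.values())
    return {name: rarefy(counts, depth, rng) for name, counts in table.items()}

# Hypothetical count table: sample name -> {taxon: read count}.
table = {
    "sample1": {"ASV1": 50, "ASV2": 30, "ASV3": 20},
    "sample2": {"ASV1": 10, "ASV2": 25, "ASV4": 5},
}
rarefied = rarefy_table(table)
```

Because sampling is done without replacement, no taxon can end up with more reads than it had originally, and the smallest sample is returned unchanged.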

For each data set, the DADA2 and UNOISE tables having the highest ASV counts were used to generate a consensus table, which only contained ASVs detected by both pipelines. A function for generating a consensus table from an unlimited number of count tables was implemented in qdiv and is described in Text S4 (supplementary material). For the AGS data set, the consensus table was inferred from count tables #3 and #6. They together had 2041 unique ASVs, but only 919 of these were found in both and used in the consensus table. For the MFC data set, count tables #1 and #6 were used as input for the consensus table. They together had 5952 unique ASVs, out of which 2489 ASVs were found in both and used in the consensus table. For both data sets, most of the reads were associated with the ASVs that were kept in the consensus tables (99.7% for the AGS data and 97.1% for the MFC data).
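
The idea behind a consensus table can be illustrated with a two-table sketch. The actual algorithm is described in Text S4 and is not reproduced here. This example simply assumes that ASVs are matched by exact sequence and that read counts are taken from the first table. The function name and data are hypothetical.

```python
def consensus_table(table_a, table_b):
    """Keep only ASVs present in both tables.

    Each table maps an ASV sequence to a list of read counts per sample.
    Counts are taken from table_a (an assumption made for this sketch).
    Returns the consensus table and the fraction of table_a reads retained.
    """
    shared = set(table_a) & set(table_b)
    kept = {asv: counts for asv, counts in table_a.items() if asv in shared}
    reads_total = sum(sum(counts) for counts in table_a.values())
    reads_kept = sum(sum(counts) for counts in kept.values())
    return kept, reads_kept / reads_total

# Hypothetical tables from two pipelines (two samples each).
pipeline_1 = {"ACGT": [900, 800], "ACGA": [90, 100], "TTGA": [2, 0]}
pipeline_2 = {"ACGT": [880, 790], "ACGA": [95, 110], "GGCA": [1, 1]}
consensus, fraction_kept = consensus_table(pipeline_1, pipeline_2)
```

In this toy case one of three ASVs is dropped while almost all reads are retained, which is the pattern reported for the AGS and MFC data sets.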

Assessing choice of pipeline and dissimilarity index by analysis of replicates

Both the AGS and MFC samples contained microbial community replicates, which means that DNA was extracted in parallel from six aliquots of biomass collected from the same microbial community (e.g. the same AGS reactor or the same MFC biofilm). The MFC samples also contained one set of technical replicates, which in this study means that the same DNA extract was processed in six separate PCR reactions followed by sequencing of the six separate PCR products.

The diversity order (q) of the dissimilarity index had a strong effect on the dissimilarity between replicates. The highest dissimilarity was observed for incidence-based indices (⁰d and Jaccard). The dissimilarity decreased with increasing diversity order. Overall, the technical replicates had lower dissimilarity than the community replicates (Fig. 3A). The difference between technical and community replicates was statistically significant for 6 out of 12 count tables for ⁰d and 9 out of 12 tables for ¹d (p < 0.05, Welch’s ANOVA with Holm-Bonferroni correction for multiple comparisons), but not for any of the count tables at ²d. Different count tables showed different dissimilarity between community replicates. For the AGS samples, the Mothur table (#11) showed significantly higher dissimilarity than all the other tables for ⁰d, and the UPARSE separate table (#10) showed significantly higher dissimilarity than all the others for ¹d (p < 0.05). For ²d, there was no significant difference between the tables (Fig. 3B-D). The consensus table had lower dissimilarity between replicates than the two count tables used to generate it. In the AGS data set, the ⁰d between replicates dropped from 0.19±0.03 and 0.14±0.01 with count tables #3 and #6, to 0.12±0.01 with the consensus table (p < 0.05). The dissimilarities between replicates for both the AGS and MFC data sets and for several dissimilarity indices and pipelines are shown in Fig. S4-S5 (supplementary material).

Dissimilarity matrices generated with different pipelines and indices

Having seen differences in the number of OTUs/ASVs and read counts (Fig. 2), as well as in the dissimilarity between replicates for count tables generated with different bioinformatics pipelines (Fig. 3), we asked if dissimilarity matrices calculated from the count tables would show the same patterns in the data. Matrices of pairwise dissimilarities between samples are typically used to explore microbial communities. First, we investigated the similarity of dissimilarity matrices generated using different indices and count tables. The matrices were compared using the Mantel test with the Pearson correlation coefficient (r) or the Spearman rank correlation coefficient (ρ) as test statistic. For all pairwise comparisons between dissimilarity matrices, there was a statistically significant correlation (p=0.001, 999 permutations), and the correlation coefficients ranged from 0.70 to 1.00. However, although the dissimilarity matrices were highly correlated with each other, there were some differences. To visualize these differences, a principal coordinate analysis (PCoA) of the dissimilarities (measured as 1-r) between dissimilarity matrices was carried out (Fig. 4). Each point in Fig. 4 represents a dissimilarity matrix calculated from a rarefied count table. The matrices tended to separate by index. Incidence-based indices (⁰d and Jaccard) were more scattered and clearly separated from relative-abundance based indices (¹d, ²d, Bray-Curtis). For the AGS samples, the count tables generated by Mothur also separated from the other count tables for the incidence-based indices.
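
A permutation-based Mantel test with the Pearson correlation as test statistic can be sketched as follows. This is a generic standard-library illustration with hypothetical function names, not the implementation used in the study. The p-value is computed as (hits + 1)/(permutations + 1), so 999 permutations give a smallest attainable p-value of 0.001.

```python
import math
import random

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(m1, m2, n_perm=999, seed=0):
    """Return Pearson r between two distance matrices and a permutation p-value."""
    rng = random.Random(seed)
    idx = list(range(len(m1)))
    x = upper_triangle(m1, idx)
    r_obs = pearson(x, upper_triangle(m2, idx))
    hits = 0
    for _ in range(n_perm):
        perm = idx[:]
        rng.shuffle(perm)  # permute samples (rows and columns together) in m2
        if pearson(x, upper_triangle(m2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Hypothetical example: distances between five points on a line, and the same
# distances doubled, which should be perfectly correlated.
points = [0, 1, 3, 6, 10]
m1 = [[abs(a - b) for b in points] for a in points]
m2 = [[2 * abs(a - b) for b in points] for a in points]
r, p = mantel(m1, m2)
```

Replacing the raw values by their ranks before calling `pearson` would give the Spearman variant of the test.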

Resolution power of different count tables and dissimilarity indices

The ability of different dissimilarity indices and count tables to distinguish between sample groups in the experimental data was also tested. The AGS data set was more challenging than the MFC data set because most taxa were shared between different samples. Therefore, the AGS data set with the three sample categories, the inoculum, reactor 1 (R1), and reactor 2 (R2), was used in the analysis. The F-statistic is the ratio of between-group variability and within-group variability. Dissimilarity matrices resulting in the calculation of a high F-statistic are thus better at resolving differences between sample groups. Fig. 5 shows that dissimilarity matrices generated with the ¹d and ²d indices resulted in higher F-statistics than those generated with the Bray-Curtis index, which in turn resulted in higher F-statistics than those generated with the incidence-based indices. High dissimilarity between replicates, which was observed for the incidence-based indices (Fig. 3), would result in a lower F-statistic. Despite large differences in the F-statistic, statistically significant separation between the three sample groups was found with all count tables and dissimilarity indices (PERMANOVA, p=0.001, 999 permutations). An example of a PCoA showing the separation between sample groups is shown in Fig. S6 and F-statistics for all pipelines are shown in Fig. S7 (supplementary material).
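
The F-statistic can be computed directly from a dissimilarity matrix. The sketch below assumes the usual PERMANOVA pseudo-F, in which sums of squares are obtained from squared pairwise dissimilarities. The function name and the example data are hypothetical.

```python
def pseudo_f(dist, groups):
    """Pseudo-F: between-group variability divided by within-group variability.

    dist: symmetric matrix (list of lists) of pairwise dissimilarities.
    groups: group label of each sample, in the same order as the matrix.
    """
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise dissimilarities.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i in range(n) if groups[i] == label]
        pairs = sum(dist[i][j] ** 2
                    for k, i in enumerate(members) for j in members[k + 1:])
        ss_within += pairs / len(members)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

# Hypothetical example: two tight groups located far from each other.
positions = [0, 1, 10, 11]
dist = [[abs(x - y) for y in positions] for x in positions]
f_value = pseudo_f(dist, ["A", "A", "B", "B"])
```

Higher dissimilarity between replicates inflates the within-group term and therefore lowers the statistic, which is the effect described for the incidence-based indices.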

Effects of pipelines and indices on hypothesis testing

AGS experiment

In the AGS experiment, we hypothesized that R1 and R2 diverged from the inoculum to the same extent after 150 days of operation since they were operated under identical conditions. Thus, the dissimilarity between the inoculum and R1 should be the same as between the inoculum and R2. The results based on two count tables are shown in Fig. 6. For high diversity orders (q ≥ 0.4), both show larger dissimilarity between the inoculum and R2 than between the inoculum and R1. For low diversity orders, count table #3 (Fig. 6A) still shows the same result whereas count table #7 (Fig. 6B) shows higher dissimilarity between the inoculum and R1. The results from all count tables with Hill-based, Jaccard, and Bray-Curtis dissimilarity indices are shown in Fig. S8-S9 (supplementary material). All count tables showed that if q ≥ 0.6, the inoculum had significantly higher dissimilarity with R2 than with R1 (p < 0.05, Welch’s ANOVA). The Bray-Curtis dissimilarity showed the same. However, the results varied with the Jaccard index and with ^qd for q < 0.6.

MFC experiment

In the MFC experiment, we compared microbial communities of biofilms growing on anodes (conductive surface functioning as electron acceptor) with biofilms growing on non-conductive porous separators. We hypothesized that biofilms growing on conductive and non-conductive surfaces would be more dissimilar to each other in the acetate-fed MFC than in the glucose-fed MFC. Glucose is a fermentable substrate and fermentative microorganisms should be able to grow anywhere within the MFCs, leading to a more homogenous microbial community structure. Acetate, on the other hand, is non-fermentable and the microbial communities in an acetate-fed MFC are therefore dependent on electron acceptor availability. On the anode surface, the anode serves as electron acceptor while in other locations within the MFCs, the microorganisms must use soluble compounds such as oxygen diffusing in through the gas-diffusion cathode. Microbial communities in different locations of the acetate-fed MFCs should therefore have different metabolisms, which likely leads to higher dissimilarity than between communities within the glucose-fed MFCs which, at least partly, could have the same metabolism, namely fermentation [28].

For high diversity orders (q > 0.8), most count tables showed higher dissimilarity in the acetate-fed MFC than in the glucose-fed MFC. For low diversity orders, the glucose-fed MFC had higher dissimilarity (Fig. S10-S11, supplementary material). For the MFC data set, count tables generated with different bioinformatics pipelines showed quite different results. In particular, tables #1 and #2, generated with DADA2 run in pooled mode, showed a much smaller difference between the two types of MFCs than the other count tables. An example is shown in Fig. 7.

Null model

Null models were used to aid in the interpretation of dissimilarity values. An example from the AGS experiment is shown in Fig. 8A-C. The dissimilarity between the inoculum and R1 is not significantly different from the null distribution at any diversity order and consequently ^qRC is close to 0.5. For the inoculum and R2, the observed dissimilarity is higher than the null expectation, showing a higher degree of compositional changes in the microbial community.

For the MFC data set, we chose to run the null model for two count tables because the dissimilarity between the two types of biofilms in the acetate-fed MFC had a different shape when the results were based on count tables #1 and #2 compared to the other count tables (Fig. S10, supplementary material). The results from the null model analysis are shown in Fig. 8D-I. At a diversity order of 0, the observed dissimilarity is similar to the null expectation and consequently ^qRC is close to 0.5. This indicates that if we only care about detection of ASVs, there is a random distribution between the two biofilm communities. With increasing emphasis on relative abundance, the dissimilarity between biofilm types is higher than the null distribution. For the acetate-fed MFCs, the ^qRC values are close to 1, which means significant compositional differences between the two communities. For the glucose-fed MFCs, the ^qRC again drops to lower values at a diversity order above 1. This means that some of the most abundant ASVs are shared between biofilms growing on conductive and non-conductive surfaces. This indeed turned out to be the case with a Trichococcus sp. being highly abundant in both biofilm communities, likely carrying out fermentation in both places [28].

Comparison of count tables

Previous studies comparing bioinformatics pipelines for high-throughput sequencing of marker genes have found large differences in alpha diversity estimates [40, 41]. We also observed that both the pipeline and the input parameter values chosen by the user affected the number of inferred OTUs/ASVs as well as the number of reads mapped to these. With real samples of unknown composition, it is difficult to choose which pipeline and which settings to use for the analysis. A way to approach the problem of inflated OTU/ASV counts is to infer a consensus table based on OTUs/ASVs detected using several different pipelines. We have implemented an algorithm for doing this in qdiv (Text S4, supplementary material). Running the algorithm on the DADA2 and UNOISE count tables having the highest ASV counts resulted in dramatic drops in ASV count in the consensus table; however, most of the reads (97.1-99.7%) were associated with the consensus ASVs.

Dissimilarity between replicates

Dissimilarity between replicates can be caused by many factors associated with sampling, DNA extraction, PCR, sequencing, and data processing [42]. The comparison between community- and technical replicates in Fig. 3A suggested that only a relatively small fraction is associated with sampling and DNA extraction for the case of an MFC biofilm sampled from an anode. High dissimilarity between replicates can make it difficult to use marker-gene amplicon sequencing to distinguish groups of samples. For example, Bautista-de los Santos et al. [43] studied microbial communities in drinking water using the Jaccard and Bray-Curtis indices on an OTU table generated with Mothur. Fewer significant differences between sample groups were observed with the Jaccard index because of high dissimilarity between replicate samples [43]. We also observed much lower F statistics with incidence-based dissimilarity indices (Fig. 5), which was caused by higher dissimilarity between community replicates in relation to dissimilarity between sample groups. The dissimilarity between replicates for the incidence-based indices could be lowered somewhat by generating a consensus table. This also led to an increased F-statistic when sample groups were compared (Fig. S7, supplementary material). With incidence-based and low diversity order indices, OTUs/ASVs with very low relative abundance can have a high impact on the dissimilarity values. By generating a consensus table, many low-abundant and potentially spurious ASVs were dropped from the data set.

The dissimilarity between replicates decreased with increasing diversity order until q was approximately one. For some samples, most notably the biofilm samples from non-conductive surfaces in the MFC experiment, the dissimilarity between replicates then increased at higher diversity order (i.e. q=2) and for the Bray-Curtis index (Fig. S5, supplementary material). Dissimilarity between replicates could be caused by random sampling effects [5] and generation of erroneous OTUs/ASVs during PCR, sequencing and data processing. This would affect the detection/non-detection of low-abundant OTUs/ASVs, which has a strong influence on the incidence-based indices. At a high diversity order, the calculated dissimilarity is highly dependent on the relative abundance of the most abundant OTUs/ASVs in each sample. Small differences in relative abundance values of those OTUs/ASVs are amplified, which leads to increasing dissimilarity. The ¹d index, which weighs OTUs/ASVs exactly according to their relative abundance in the sample, seems to be a good compromise leading to low dissimilarity between replicates and hence better possibilities of detecting actual differences between sample groups exposed to different treatments.

Hill-based indices for hypothesis testing

Dissimilarity matrices generated with different indices and count tables differed from each other (Fig. 4), but they were all able to distinguish between samples belonging to different categories, such as the inoculum, R1, and R2 sample groups in the AGS data set. However, when specific hypotheses comparing the dissimilarities between different groups of samples were tested, the results varied. The results demonstrated the importance of carrying out analyses with several dissimilarity indices.

Previous research has shown that Hill numbers are suitable for quantifying alpha diversity in samples obtained by high-throughput sequencing of marker-genes [26]. For example, Haegeman et al. [44] analyzed alpha diversity as a function of diversity order and concluded that Hill numbers with q > 1 give robust estimates of alpha diversity. In this study, we show that dissimilarity profiles, which show the dissimilarity between samples as a function of diversity order (Fig. 6-7), are highly informative also in the study of beta diversity. The use of a single dissimilarity index would have given misleading information for the data sets investigated in this study. In the AGS experiment, incidence-based indices gave conflicting results for different count tables. For some count tables, the dissimilarity between the inoculum and R1 was larger than the dissimilarity between the inoculum and R2, some count tables showed the opposite, and some did not show a difference. Had we used only an incidence-based index, we might have concluded that R1 and R2 were about equally dissimilar to the inoculum, which was in line with our hypothesis for the AGS experiment. However, at higher diversity order, there was a clear difference between R1 and R2. In the MFC experiment, the incidence-based indices would have led us to conclude that the dissimilarity between biofilms on conductive and non-conductive surfaces in the acetate-fed MFCs was lower than in the glucose-fed MFCs, contrary to our hypothesis. However, when we plot dissimilarity as a function of q, we see that when we focus on the “common” OTUs/ASVs, the bioanodes and biofilms in the glucose-fed MFCs are in fact less dissimilar, in line with our hypothesis.

In contrast to the commonly used Bray-Curtis index, the Hill-based dissimilarity indices have an intuitive interpretation. The ^qd index quantifies the average proportion of OTUs/ASVs in one sample not shared with the other sample [24]. If two samples each have S equally common species and C species are shared, the dissimilarity value would be 1-C/S [23]. Thus, the number itself has a meaning. For example, ⁰d can be interpreted as the average proportion of all OTUs/ASVs, ¹d as the average proportion of “common” OTUs/ASVs, and ²d as the average proportion of “abundant” OTUs/ASVs not shared between two samples.
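The 1-C/S interpretation can be checked numerically. The sketch below assumes the standard multiplicative Hill-number framework for two equally weighted samples, in which beta diversity is gamma divided by alpha and the dissimilarity is (1 - β^(1-q)) / (1 - 2^(1-q)), with the limit ln(β)/ln(2) at q = 1. The function name `hill_dissimilarity` is hypothetical and the code is not taken from qdiv.

```python
import numpy as np

def hill_dissimilarity(counts_a, counts_b, q):
    """Pairwise Hill-based dissimilarity of order q for two samples.

    Sketch under the assumption of equal sample weights; the two inputs
    must list counts for the same OTUs/ASVs in the same order.
    """
    a = np.asarray(counts_a, dtype=float)
    b = np.asarray(counts_b, dtype=float)
    pa = a / a.sum()            # relative abundances in sample a
    pb = b / b.sum()            # relative abundances in sample b
    pm = (pa + pb) / 2          # pooled (regional) relative abundances

    if np.isclose(q, 1.0):
        # q = 1 is defined as a limit and uses Shannon entropy.
        def shannon(p):
            p = p[p > 0]
            return -np.sum(p * np.log(p))
        ln_beta = shannon(pm) - (shannon(pa) + shannon(pb)) / 2
        return ln_beta / np.log(2)

    def power_sum(p):
        p = p[p > 0]
        return np.sum(p ** q)

    alpha = ((power_sum(pa) + power_sum(pb)) / 2) ** (1 / (1 - q))
    gamma = power_sum(pm) ** (1 / (1 - q))
    beta = gamma / alpha
    return (1 - beta ** (1 - q)) / (1 - 2 ** (1 - q))

# Two samples with S = 10 equally common taxa each, of which C = 4 are shared.
sample_a = [1] * 10 + [0] * 6
sample_b = [0] * 6 + [1] * 10
profile = {q: hill_dissimilarity(sample_a, sample_b, q) for q in (0, 1, 2)}
```

For this idealized case every diversity order gives 1 - 4/10 = 0.6. With real, uneven communities the values differ between orders, and evaluating the function over a range of q gives a dissimilarity profile.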

Null models help us to further interpret the meaning of the dissimilarity values. The data set from the MFCs shows that for a diversity order of 0, the distribution of OTUs/ASVs between the two types of biofilms is close to the null expectation. This is logical considering that the two biofilms are physically located close to each other and linked by dispersal. There is, thus, a high likelihood that the same OTUs/ASVs can be detected in both locations, even if they do not grow in both locations. For higher diversity order (i.e. q = 1) we see a higher dissimilarity than the null expectation, suggesting that the common OTUs/ASVs are different in the two locations. This could be explained by heterogeneous selection. The conductive anode surface selects for electroactive microorganisms whereas the non-conductive separator selects for oxygen scavengers. For even higher diversity order (q = 2), the dissimilarity between the two biofilms in the glucose-fed MFC again approaches the null expectation. This is logical considering that one of the most abundant taxa in the glucose-fed MFCs was a fermentative Trichococcus sp., which could grow in both locations [28].
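The idea of comparing an observed dissimilarity with a null expectation can be sketched as follows. The randomization scheme used here is an assumption chosen for illustration: each sample keeps its richness and read count, taxa are drawn with probability proportional to the number of samples they occur in, and reads are distributed in proportion to total abundance in the regional pool. It is not necessarily the scheme implemented in qdiv. The function names are hypothetical, and the Bray-Curtis index is used only to keep the example short; any dissimilarity function taking two count vectors could be passed in.

```python
import numpy as np

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    return np.abs(a - b).sum() / (a + b).sum()

def null_expectation(table, i, j, dis=bray_curtis, n_iter=199, seed=0):
    """Observed dissimilarity between samples i and j plus null mean and sd.

    table: 2D array of integer read counts, rows = OTUs/ASVs, columns = samples.
    Illustrative randomization only (see the assumptions stated above).
    """
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=float)
    n_taxa = table.shape[0]
    occupancy = (table > 0).sum(axis=1)   # samples in which each taxon occurs
    abundance = table.sum(axis=1)         # total reads per taxon in the pool
    pick_prob = occupancy / occupancy.sum()

    null_values = np.empty(n_iter)
    for k in range(n_iter):
        simulated = []
        for col in (i, j):
            richness = int((table[:, col] > 0).sum())
            reads = int(table[:, col].sum())
            # Draw the taxa that are present, keeping richness constant.
            chosen = rng.choice(n_taxa, size=richness, replace=False, p=pick_prob)
            sim = np.zeros(n_taxa)
            sim[chosen] = 1               # every chosen taxon gets one read
            weights = abundance[chosen] / abundance[chosen].sum()
            # Distribute the remaining reads, keeping the read count constant.
            sim[chosen] += rng.multinomial(reads - richness, weights)
            simulated.append(sim)
        null_values[k] = dis(simulated[0], simulated[1])

    observed = dis(table[:, i], table[:, j])
    return observed, null_values.mean(), null_values.std()

# Toy count table: four taxa (rows) in three samples (columns).
counts = np.array([[50, 50, 10],
                   [30, 30, 0],
                   [20, 20, 40],
                   [0, 0, 50]])
observed, null_mean, null_sd = null_expectation(counts, 0, 2)
```

An observed value well above the null mean would point towards selection driving the samples apart, whereas a value close to the null mean would be consistent with random assembly from the shared pool.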

Conclusions

Bioinformatics pipelines run with different settings resulted in count tables having large differences in the number of OTUs/ASVs and total reads.
A way to minimize the effect of low-abundant and possibly spurious OTUs/ASVs on the analysis is to generate a consensus table based on several other count tables generated using different denoising pipelines (e.g. UNOISE, DADA2, and Deblur).
Conclusions drawn from experimental data can depend on the chosen dissimilarity index. To fully understand beta diversity patterns, Hill-based dissimilarity values should be calculated for several diversity orders (q). Dissimilarity profiles plotting ^qd as a function of q are informative when pairwise dissimilarities between groups of samples are of interest.
Null models, which can be calculated based on all dissimilarity indices, help in the interpretation of dissimilarity values and give information about community assembly mechanisms.
The Python package qdiv, freely available at https://github.com/omvatten/qdiv with documentation at https://qdiv.readthedocs.io/en/latest/, enables simple calculation of all Hill-based dissimilarity indices and associated null models.

Experimental

Samples collected from two separate experiments were analyzed in this study. In the AGS experiment, granular sludge from a sequencing batch reactor was used to inoculate two new reactors (R1 and R2). Six samples were collected from the inoculum as well as from each of the two new reactors after 150 days of operation (Fig. S12, supplementary material). The sets of six are called community replicates.

In the MFC experiment, parallel MFCs were operated with either acetate or glucose as the sole electron donor [for details, see 28]. Samples were collected from the anode where a biofilm of electroactive microorganisms oxidized the electron donor and generated electrical current, and from a non-conductive porous separator where a biofilm oxidized or fermented the electron donor and scavenged oxygen (Fig. S13, supplementary material). In one acetate- and one glucose-fed MFC, the biofilm samples were each cut into six pieces and DNA was extracted and processed separately from each piece. These samples are called community replicates. The DNA extracted from one of the anode-attached biofilm samples was also processed in six separate PCR reactions. These samples are called technical replicates.

DNA was extracted using the FastDNA Spin Kit for Soil (MP Biomedicals). PCR amplification of the V4 region of the 16S rRNA gene was carried out with the primer pair 515’F (GTGBCAGCMGCCGCGGTAA) and 806R (GGACTACHVGGGTWTCTAAT) [45, 46] and the dual indexing strategy by Kozich et al. [3]. High-throughput sequencing was carried out using the Illumina MiSeq platform and reagent kit V3 (2x300 bp paired-end sequencing). Further details are provided in Text S5 (supplementary material). The samples from the AGS and MFC experiments were processed in two separate sequencing runs. The sequencing results were deposited in the European Nucleotide Archive with accession numbers PRJEB35721 (AGS data set) and PRJEB26776 (MFC data set).

Bioinformatics

The sequence reads were processed using DADA2 version 1.10 [36], Deblur version 1.04 [37], USEARCH version 10 [38], and Mothur version 1.41 [39]. The pipelines offer the user various choices. For example, the stringency of the quality filtering method can typically be varied, and the reads can often be processed either separately sample-by-sample or in pooled mode. Analysis of pooled samples requires more computer memory. DADA2 and Deblur generate ASVs whereas Mothur generates OTUs. USEARCH can either generate ASVs using UNOISE [47] or OTUs using UPARSE [48]. Several count tables were generated using various input parameter settings in the pipelines (Text S3, supplementary material). Details about the pipelines are provided at github.com/omvatten/amplicon_sequencing_pipelines. Based on the DADA2 and UNOISE count tables with the highest ASV counts, consensus tables consisting of ASVs found with both were generated using qdiv.
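The principle of a consensus table can be sketched with pandas. The example assumes that ASVs from the two pipelines are matched by exact sequence identity and that read counts are taken from the first table; the algorithm actually used in qdiv is described in Text S4 and may differ in both respects. The function name `consensus_table` and the toy sequences are hypothetical.

```python
import pandas as pd

def consensus_table(table_a, table_b):
    """Keep the ASVs of table_a whose sequences also occur in table_b.

    Both tables are DataFrames indexed by ASV sequence with one column of
    read counts per sample. Returns the consensus table and the fraction
    of the reads in table_a that it retains.
    """
    shared = table_a.index.intersection(table_b.index)
    consensus = table_a.loc[shared]
    retained = consensus.to_numpy().sum() / table_a.to_numpy().sum()
    return consensus, retained

# Toy count tables from two pipelines (sequences shortened for readability).
dada2_like = pd.DataFrame(
    {"S1": [90, 9, 1], "S2": [80, 15, 5]},
    index=["AAA", "CCC", "GGG"],
)
unoise_like = pd.DataFrame(
    {"S1": [88, 10, 2], "S2": [79, 16, 5]},
    index=["AAA", "CCC", "TTT"],
)
consensus, retained = consensus_table(dada2_like, unoise_like)
```

In the toy data one of three ASVs is dropped, yet 97% of the reads remain, which mirrors the pattern of a large drop in ASV count with only a small loss of reads.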

Software

A software package, qdiv, allowing calculation of all the indices and null models mentioned above was developed in Python 3. It makes use of the following Python packages: pandas [49], numpy [50], matplotlib [51], and python-Levenshtein. The source code for qdiv is available at https://github.com/omvatten/qdiv, and the package can be installed via PyPI and the Anaconda cloud.

Statistical analysis

To compare dissimilarity matrices generated using different pipelines and indices, the Pearson correlation coefficient (r) and the Spearman rank-correlation coefficient (ρ) were used to quantify whether two dissimilarity matrices were correlated, i.e., whether high dissimilarity values in one matrix corresponded to high values in the other matrix and vice versa. Subtracting r (or ρ) from one gave a measure of the dissimilarity of the two matrices. The data were analyzed by ordination using PCoA, and the statistical significance of the association between different dissimilarity matrices was quantified using Mantel’s permutation test [52]. To compare the variability within sample groups to the variability between sample groups, permanova was used [53]. Both the Mantel test and permanova were implemented in qdiv. Welch’s anova was carried out using SciPy [54].
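The matrix comparison can be sketched with numpy. The example computes the Pearson correlation between the upper triangles of two dissimilarity matrices, the corresponding matrix dissimilarity 1 - r, and a one-sided Mantel p-value obtained by permuting the sample order of one matrix. The function name `mantel`, the number of permutations, and the toy matrices are assumptions for illustration; this is not the qdiv implementation.

```python
import numpy as np

def mantel(dis_1, dis_2, n_perm=999, seed=0):
    """Pearson-based Mantel test for two square dissimilarity matrices.

    Returns the observed correlation r and a one-sided permutation p-value
    for the hypothesis that the matrices are positively associated.
    """
    dis_1 = np.asarray(dis_1, dtype=float)
    dis_2 = np.asarray(dis_2, dtype=float)
    n = dis_1.shape[0]
    upper = np.triu_indices(n, k=1)       # each sample pair counted once
    r_obs = np.corrcoef(dis_1[upper], dis_2[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_as_extreme = 0
    for _ in range(n_perm):
        order = rng.permutation(n)
        # Rows and columns are permuted together to keep the matrix valid.
        permuted = dis_2[np.ix_(order, order)]
        r_perm = np.corrcoef(dis_1[upper], permuted[upper])[0, 1]
        if r_perm >= r_obs:
            n_as_extreme += 1
    p_value = (n_as_extreme + 1) / (n_perm + 1)
    return r_obs, p_value

# Toy example: the second matrix is a rescaled copy of the first.
positions = np.arange(8.0) ** 2
matrix_1 = np.abs(positions[:, None] - positions[None, :])
matrix_2 = 3 * matrix_1
r, p = mantel(matrix_1, matrix_2)
matrix_dissimilarity = 1 - r
```

Replacing the two vectors of pairwise values by their ranks before computing the correlation would give the Spearman version of the same comparison.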

Ethics approval and consent to participate

Not applicable

Consent to publication

Not applicable

Availability of data and materials

Amplicon sequence data are deposited at the European Nucleotide Archive under accession numbers PRJEB35721 (AGS data set) and PRJEB26776 (MFC data set).

Bioinformatics pipelines used to process the sequence data and generate count tables are available at https://github.com/omvatten/amplicon_sequencing_pipelines.

The code for qdiv, the software developed in this project and used to analyze the count tables, is available at https://github.com/omvatten/qdiv.

Competing interests

The authors declare that they have no competing interests.

Funding

The project was funded by the Swedish Research Council (VR, grant 2012-5167) and the Swedish Research Council for Environment, Agricultural Sciences, and Spatial Planning (FORMAS, grant 2013-627 and grant 2018-01423).

Authors’ contributions

OM and SS operated the MFCs and generated the sequence data for that experiment. RL operated the AGS reactors and generated the sequence data for the experiment. OM developed the software and was the main author of the manuscript. All authors critically reviewed and approved the final manuscript.

Acknowledgements

The authors acknowledge the Genomics core facility at the University of Gothenburg for support and use of their equipment.

References

Locey KJ, Lennon JT: Scaling laws predict global microbial diversity. PNAS 2016, 113(21):5970-5975.
Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, Owens SM, Betley J, Fraser L, Bauer M et al: Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME Journal 2012, 6(8):1621-1624.
Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD: Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and environmental microbiology 2013, 79(17):5112-5120.
Aigle A, Prosser JI, Gubry-Rangin C: The application of high-throughput sequencing technology to analysis of amoA phylogeny and environmental niche specialisation of terrestrial bacterial ammonia-oxidisers. Environmental Microbiome 2019, 14(1):3.
Zhou J, Jiang Y-H, Deng Y, Shi Z, Zhou BY, Xue K, Wu L, He Z, Yang Y: Random Sampling Process Leads to Overestimation of β-Diversity of Microbial Communities. mBio 2013, 4(3):e00324-00313.
Fouhy F, Clooney AG, Stanton C, Claesson MJ, Cotter PD: 16S rRNA gene sequencing of mock microbial populations- impact of DNA extraction method, primer choice and sequencing platform. BMC microbiology 2016, 16(1):123-123.
Kembel SW, Wu M, Eisen JA, Green JL: Incorporating 16S Gene Copy Number Information Improves Estimates of Microbial Diversity and Abundance. PLOS Computational Biology 2012, 8(10):e1002743.
Gonzalez JM, Portillo MC, Belda-Ferre P, Mira A: Amplification by PCR Artificially Reduces the Proportion of the Rare Biosphere in Microbial Communities. PLOS ONE 2012, 7(1):e29973.
Schloss PD, Gevers D, Westcott SL: Reducing the effects of PCR amplification and sequencing artifacts on 16S rRNA-based studies. PLoS One 2011, 6(12):e27310.
Eren AM, Morrison HG, Lescault PJ, Reveillaud J, Vineis JH, Sogin ML: Minimum entropy decomposition: unsupervised oligotyping for sensitive partitioning of high-throughput marker gene sequences. ISME Journal 2015, 9(4):968-979.
Rosen MJ, Callahan BJ, Fisher DS, Holmes SP: Denoising PCR-amplified metagenome data. BMC Bioinformatics 2012, 13(283).
Tikhonov M, Leach RW, Wingreen NS: Interpreting 16S metagenomic data without clustering to achieve sub-OTU resolution. ISME Journal 2015, 9(1):68-80.
García-García N, Tamames J, Linz AM, Pedrós-Alió C, Puente-Sánchez F: Microdiversity ensures the maintenance of functional microbial communities under changing environmental conditions. The ISME journal 2019.
He Y, Caporaso JG, Jiang X-T, Sheng H-F, Huse SM, Rideout JR, Edgar RC, Kopylova E, Walters WA, Knight R et al: Stability of operational taxonomic units: an important but neglected property for analyzing microbial diversity. Microbiome 2015, 3(1):20.
Callahan BJ, McMurdie PJ, Holmes SP: Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME Journal 2017.
Koleff P, Gaston KJ, Lennon JJ: Measuring beta diversity for presence–absence data. Journal of Animal Ecology 2003, 72(3):367-382.
Barwell LJ, Isaac NJB, Kunin WE: Measuring β-diversity with species abundance data. Journal of Animal Ecology 2015, 84(4):1112-1122.
Porter TM, Hajibabaei M: Scaling up: A guide to high-throughput genomic approaches for biodiversity analysis. Molecular ecology 2018, 27(2):313-338.
Escolà Casas M, Nielsen TK, Kot W, Hansen LH, Johansen A, Bester K: Degradation of mecoprop in polluted landfill leachate and waste water in a moving bed biofilm reactor. Water Research 2017, 121:213-220.
Kunin V, Engelbrektson A, Ochman H, Hugenholtz P: Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environmental microbiology 2010, 12(1):118-123.
Hill MO: Diversity and evenness: A unifying notation and its consequences. Ecology 1973, 54(2):427-432.
Jost L: Entropy and diversity. OIKOS 2006, 113(2):363-375.
Jost L: Partitioning diversity into independent alpha and beta components. Ecology 2007, 88(10):2427-2439.
Chao A, Chiu C-H, Jost L: Unifying species diversity, phylogenetic diversity, functional diversity, and related similarity and differentiation measures through Hill numbers. Annu Rev Ecol Evol Syst 2014, 45:297-324.
Ellison AM: Partitioning diversity. Ecology 2010, 91(7):1962-1963.
Kang S, Rodrigues JL, Ng JP, Gentry TJ: Hill number as a bacterial diversity measure framework with high-throughput sequence data. Scientific reports 2016, 6:38263.
Ma Z: Measuring Microbiome Diversity and Similarity with Hill Numbers. In: Metagenomics. Edited by Nagarajan M: Academic Press; 2018: 157-178.
Saheb-Alam S, Persson B, Wilén B-M, Hermansson M, Modin O: Response to starvation and microbial community analaysis in microbial fuel cells enriched on different electron donors. Microbial Biotechnology 2019, 12(5):962-975.
Liébana R, Modin O, Persson F, Szabó E, Hermansson M, Wilén B-M: Combined Deterministic and Stochastic Processes Control Microbial Succession in Replicate Granular Biofilm Reactors. Environmental Science & Technology 2019, 53(9):4912-4921.
Horner-Devine MC, Lage M, Hughes JB, Bohannan BJM: A taxa–area relationship for bacteria. Nature 2004, 432(7018):750-753.
Baselga A: Partitioning the turnover and nestedness components of beta diversity. Global Ecology and Biogeography 2010, 19(1):134-143.
Chase JM, Kraft NJB, Smith KG, Vellend M, Inouye BD: Using null models to disentangle variation in community dissimilarity from variation in α-diversity. Ecosphere 2011, 2(2):24.
Chase JM, Myers JA: Disentangling the importance of ecological niches from stochastic processes across scales. Philosophical Transactions of the Royal Society B 2011, 366(2351-2363).
Raup DM, Crick RE: Measurement of faunal similarity in paleontology. Journal of Paleontology 1979, 53(5):1213-1227.
Stegen JC, Lin X, Fredrickson JK, Chen X, Kennedy DW, Murray CJ, Rockhold ML, Konopka A: Quantifying community assembly processes and identifying features that impose them. The ISME journal 2013, 7(11):2069-2079.
Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP: DADA2: High-resolution sample inference from Illumina amplicon data. Nature methods 2016, 13(7):581-583.
Amir A, McDonald D, Navas-Molina JA, Kopylova E, Morton JT, Zech Xu Z, Kightley EP, Thompson LR, Hyde ER, Gonzalez A et al: Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns. mSystems 2017, 2(2).
Edgar RC: Search and clustering orders of magnitude faster than BLAST. Bioinformatics 2010, 26(19):2460-2461.
Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, Lesniewski RA, Oakley BB, Parks DH, Robinson CJ et al: Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and environmental microbiology 2009, 75(23):7537-7541.
Allali I, Arnold JW, Roach J, Cadenas MB, Butz N, Hassan HM, Koci M, Ballou A, Mendoza M, Ali R et al: A comparison of sequencing platforms and bioinformatics pipelines for compositional analysis of the gut microbiome. BMC Microbiol 2017, 17(1):194.
Nearing JT, Douglas GM, Comeau AM, Langille MGI: Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches. PeerJ 2018, 6:e5364.
Zinger L, Bonin A, Alsos IG, Bálint M, Bik H, Boyer F, Chariton AA, Creer S, Coissac E, Deagle BE et al: DNA metabarcoding—Need for robust experimental designs to draw sound ecological conclusions. Molecular ecology 2019.
Bautista-de los Santos QM, Schroeder JL, Blakemore O, Moses J, Haffey M, Sloan W, Pinto AJ: The impact of sampling, PCR, and sequencing replication on discerning changes in drinking water bacterial community over diurnal time-scales. Water Research 2016, 90:216-224.
Haegeman B, Hamelin J, Moriarty J, Neal P, Dushoff J, Weitz JS: Robust estimation of microbial diversity in theory and in practice. The ISME journal 2013, 7(6):1092-1101.
Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, Fierer N, Knight R: Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample. P Natl Acad Sci USA 2011, 108:4516-4522.
Hugerth LW, Wefer HA, Lundin S, Jakobsson HE, Lindberg M, Rodin S, Engstrand L, Andersson AF: DegePrime, a program for degenerate primer design for broad-taxonomic-range PCR in microbial ecology studies. ISME Journal 2014, 5(10):1571-1579.
Edgar RC: UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequences. bioRxiv 2016:http://dx.doi.org/10.1101/081257.
Edgar RC: UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature methods 2013, 10(10):996-998.
McKinney W: Data Structures for Statistical Computing in Python. Proceedings of the 9th Python in Science Conference 2010, 51-56.
Oliphant TE: A guide to NumPy. USA: Trelgol Publishing; 2006.
Hunter JD: Matplotlib: A 2D Graphics Environment. Computing in Science & Engineering 2007, 9:90-95.
Mantel N: The Detection of Disease Clustering and a Generalized Regression Approach. Cancer Research 1967, 27(2 Part 1):209-220.
Anderson MJ: A new method for non-parametric multivariate analysis of variance. Austral Ecol 2001, 26(1):32-46.
Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J et al: SciPy 1.0--Fundamental Algorithms for Scientific Computing in Python. arXiv:190710121 2019.

Supplementary200213.pdf
