Multi-omic Characterization of the Taxa-function Relationship in Infant Gut Microbiomes


 BackgroundThe infant intestinal microbiome plays an important role in metabolism and immune development with impacts on lifelong health. The linkage between the taxonomic composition of the microbiome and its metabolic phenotype is undefined and complicated by redundancies in the taxon-function relationships within microbial communities. To inform a more mechanistic understanding of the relationship between the microbiome and health, we performed an integrative statistical and machine learning-based analysis of microbe taxonomic structure and metabolic function in order to characterize the taxa-function relationship in early life.ResultsStool samples collected from infants enrolled in the New Hampshire Birth Cohort Study (NHBCS) at approximately 6-weeks (n = 168) and 12-months (n = 282) of age were profiled using targeted and untargeted nuclear magnetic resonance (NMR) spectroscopy as well as DNA sequencing of the V4-V5 hypervariable region from the bacterial 16S rRNA gene. There was significant inter-omic concordance based on Procrustes analysis of sample distances (6 weeks: p = 0.056; 12 months: p = 0.001), however this association was no longer significant when accounting for phylogenetic relationships using generalized UniFrac distance metric (6 weeks: p = 0.376; 12 months: p = 0.069). Sparse canonical correlation analysis showed significant correlation, as well as identifying sets of microbe/metabolites driving microbiome-metabolome relatedness. Performance of machine learning models varied across different metabolites, with support vector machines (radial basis function kernel) being the consistently top ranked model. However, predictive R2 values demonstrated poor predictive performance across all models assessed (avg. R2: -5.06% -- 6 weeks; -3.7% -- 12 months). Conversely, the Spearman correlation metric was higher (avg. correlation: 0.344 – 6 weeks; 0.265 – 12 months). This demonstrated that taxonomic relative abundance was not predictive of metabolite concentrations. ConclusionsOur results suggest a degree of overall association between taxonomic profiles and metabolite concentrations. However, lack of predictive capacity for stool metabolic signatures reflects, in part, the possible role of functional redundancy in defining the taxa-function relationship in early life as well as the bidirectional nature of the microbiome-metabolome association. Our results provide evidence in favor of a multi-omic approach for microbiome studies, especially those focused on health outcomes.


55
The human gut microbiome is a complex and diverse system of microorganisms that co-inhabit the intestinal 56 lumen and play a crucial role in modulating human health and disease [1,2]. The development of the 57 microbiota in early life is a sensitive process akin to primary ecological succession [3], and therefore both 58 reliant on and vulnerable to external perturbations. Studies have linked microbiome alterations to long-term 59 health consequences, including risk of obesity [4], type I diabetes [5], and inflammatory bowel disease [6]. As 60 such, there is a need to understand how the microbiome participates in the multifactorial pathways leading to 61 long-term disease outcomes. One key to this open question lies in the currently undefined relationship between 62 the taxonomic structure of the microbiome and its metabolic phenotype. Previous studies addressing this 63 question have mainly focused on DNA-based profiling of microbial functional potential, which, due to 64 complicated regulatory mechanisms at the cellular level beyond the genome, is not equivalent to the 65 microbiota's realized functional landscape [7]. 66 There exists a bidirectional association between the metabolome and the microbiome in the gut [8,9]. These 67 low molecular weight molecules can either be nutrients that shape the composition of the microbiome [10] or 68 important byproducts of host-microbe nutrient co-metabolism that help regulate host metabolic homeostasis 69 4 [11 -13]. For example, members of the Clostridium clusters can produce and increase serum levels of 70 branched chain amino acids, which are markers for insulin resistance and diabetes [14,15]. However, studies 71 suggest that the fecal metabolome specifically can be used as a readout of gut microbe metabolic functions. 72 Zierer et al. [16] showed in a large cohort of adult females (n = 786) from the TwinsUK study that around 60% 73 of the fecal metabolome is associated with microbial composition, where on average, 67% of variance in the 74 metabolome can be explained by the microbiome. 75 Recent studies have integrated the metabolome in microbiome analyses of health outcomes, most notably 76 Lloyd et al. [17]  Here, we present our study investigating associations between microbe abundances assayed with 16S rRNA 89 sequencing and metabolomic profiles measured with 1 H NMR spectroscopy in a cohort of infants representing 90 a rural general population from the New Hampshire Birth Cohort Study (NHBCS). This is a unique 91 epidemiological cohort with multi-omic profiling of infant stool samples at multiple time points accompanied with 92 rich sociodemographic, dietary and health outcomes data [25]. Our study utilizes predictive modeling, 93 5 multivariate correlation methods and distance-based approaches to characterize the dynamic relationship 94 between the gut microbiome and the gut metabolome in early life. 95

96
The overall workflow and subject selection process are described in Figure 1. Primary analyses were 97 performed on paired microbiome-metabolome data sets on samples collected at approximately 6 weeks (N = 98 158) and 12 months (N = 282) of age (65 subjects have paired samples collected at both time points). In order 99 to take advantage of the most samples from this study, each time point was analyzed separately with 00 sensitivity analyses performed on sample pairs. After processing and filtering, we evaluated a final taxonomic 01 dataset of 48 genera in 6 weeks samples and 72 genera in 12 months samples. Metabolomic profiles were 02 represented as 208 unique untargeted spectral bins and a concentration-fitting method [26] was used to 03 acquire specific relative concentrations of 36 targeted metabolites. These metabolites were chosen based on a 04 literature search (Table S1) for compounds known to be associated with commensal gut microbes. Results 05 presented here will primarily feature the targeted dataset, with accompanying figures and tables for the 06 untargeted data set in the supplemental section. Figure 1 shows the overall workflow including the sample 07 selection process. 08

Study population 09
Study subject characteristics are summarized in Table 1 separately for both subjects providing samples at 6-10 week of age (n = 158) and 12-months of age (n = 282). Characteristic of our population, most infants are White 11 (97.5% among subjects contributing a 6-week sample; 95.4% among subjects contributing a 12-month 12 sample), delivered vaginally (6 weeks samples: 72.2%; 12 months samples: 70.9%) and did not have any 13 systemic antibiotic exposure during initial hospitalization following birth (6 weeks samples: 95.6%; 12 months 14 samples 97.2%). The average birth weight was also similar across subjects irrespective of the sample time 15 point, 3370 g (± 480) for infants contributing 6-week samples and 3430 g (±528) for infants contributing 12-16 month samples. Similarly, the average gestational age was 39.1 weeks (± 1.59) (6-week samples) and 39 17 weeks (± 1.7) (12-month samples). At the time of 6-week sample collection, 62% of infants had been 18 6 exclusively breastfed while by the time of 12-month sample collection, 59.2% of infants received breast milk 19 supplemented with formula, however a large minority (35.1%) remained exclusively breastfed. There were 20 more male than female infants in the cohort (53.8% male among infants contributing a 6-week sample; 56.4% 21 male among infants contributing a 12-month sample). Maternal smoking during pregnancy was rare (6-week 22 samples: 7%; 12-month samples: 5%).  Inter-omic sample distance comparison suggests overall concordance between data sets 30 Global concordance between the microbiome and the metabolome was observed across both time points and 31 metabolomic data sets (Figure 2A, Figure S1A) when analyzed using a symmetric Procrustes test with 32 samples ordinated by Euclidean distances (Methods). It is noted that the p-value at 6 weeks for the targeted 33 data set (p = 0.057) was only trending close to significant at the 0.05 level. 34 Since the Procrustes test was performed on principal coordinate (PCoA) ordinations of sample distances, the 35 result is sensitive to the choice of dissimilarity metric. In addition to standard Euclidean distances, the 36 generalized UniFrac (gUniFrac) metric was also leveraged to account for phylogeny in calculating differences 37 between samples. With gUniFrac, the association was not significant at either time points for the targeted data 38 set only ( Figure 2B), while the untargeted data set still maintained strong concordance (6 weeks samplesp = 39 0.001; 12 months samplesp = 0.006; Figure S1B). Euclidean distances of log-transformed metabolite profiles. There were significant associations between the 47 10 microbiome and the metabolome (both targeted and untargeted) when utilizing Euclidean distances, however 48 this association goes away when the gUniFrac distance was employed for the targeted metabolites only. 49 overall correlation pattern, where non-significant correlations are not colored (false discovery rate (FDR) 53 12 points. In the targeted data set Bifidobacterium was selected only at 6 weeks and Lactobacillus was only 79 selected at 12 months. Most notably, more microbes were selected at 12 months compared to 6 weeks in the 80 targeted data set, however in the untargeted data set this pattern was reversed ( Figure S2, right panels). In 81 terms of selected metabolites, the majority of the selected metabolites in the targeted data set were amino 82 acids (Supplementary Table 1), with some short chain fatty acids (SCFAs) selected at the 6-week time point. 83 Microbial community structure is weakly predictive of stool metabolite relative concentrations 84 In order to determine how well the fecal metabolome acts as a functional representation of the gut microbiome, 85 we fitted metabolite-specific prediction models based on taxonomic profiles. Chosen models include random 86 forest (RF), elastic net (EN), support vector machines with radial basis kernel (SVM-RBF) and sparse partial 87 least squares (SPLS), all of which had previously been shown to work well with microbiome-associated 88 learning tasks [28]. Evaluation was based on predicted R-squared (R 2 ) and Spearman correlation coefficient 89 (SCC) as measured using 100 repeats of 5-fold nested cross validation (Methods). 90 Predictive performance was more dependent on the metabolite being predicted than by choice of model 91 5.6% at 6 weeks; -3.07% at 12 months), which indicated that for most prediction tasks the fitted model was 94 less predictive than a naïve, intercept only model. At 6 weeks 22.2% of metabolites had models that perform 95 significantly better than the null (lower bound of 95% credible interval larger than 0) while at 12 months 38.9% 96 of metabolites fit the classification. However, even among such metabolites, the maximum R 2 is only 11.8% at 97 6 weeks and 8.7% at 12 months. Conversely, SCC values were higher in comparison (cross-metabolite avg.: 98 0.339 at 6 weeks and 0.249 at 12 months) (Figure 4 panel B, Supplementary Notes 3). At 6 weeks, 83% of 99 metabolites were significantly more performant than the null, while at 12 months all metabolites were selected. 00 Using a more stringent cutoff as used by Mallick et al. [29], the majority of metabolites at 6 weeks (69.4% of 01 total metabolites) still remained as well predicted while conversely at 12 months only 38.9% (of total 02 metabolites) were predictable. 03 13 Results from the untargeted analysis showed higher performance values for both evaluation metrics 04 (Supplementary Note 3). Specifically, 56.7% of metabolites bins at 6 weeks and 42.7% of bins at 12 months 05 had R 2 values significantly higher than 0. However, under SCC, while 57% of metabolite bins at 6 weeks had 06 SCC values significantly larger than 0.3 cutoff, only 28.8% of metabolite bins at 12 months fit this criterion. 07 Despite better performance, the overall average values were still low, suggesting that across the entire 08 metabolome few metabolites were highly predictable. 09 Despite weak predictive performance values, we were still interested in determining a model that stands out as 10 the most appropriate for our prediction task. Aggregating performance across metabolites stratified by model 11 for both evaluation metrics ( Figure 5, top panel), it can be observed that the average performances were 12 similar (Supplementary Notes 3), for which no semi-targeted analyses performed better on average than the 13 naive model under R 2 . This is further illustrated when model performance was aggregated by rank using Borda 14 scores ( Figure 5, bottom panel). A higher score indicated that a model was selected as the top choice more 15 times than others, where an even score distribution across models corroborated the suggestion that no model 16 was best across all prediction tasks. That said, SVM-RBF seemed to be the highest scoring model, particularly 17 for the 6-week time point. The untargeted analysis also found similar results ( Figure S3). 18

Sensitivity analyses 19
We performed both Procrustes and correlation analyses on a data set restricted to the 65 subjects with paired 20 samples collected at both time points (6 weeks and 12 months). Each time point was analyzed separately as in 21 our main analysis. In the targeted data set, significant Procrustes concordance was observed at 12 months (p-22 value = 0.003) but not at 6 weeks (p-value = 0.106). This association was no longer significant when 23 considering taxonomic ordination using the gUniFrac distance metric (6 weeks). Surprisingly, in the untargeted 24 data set, no association was observed across both time points and choice of distance metric ( Figure S5, S6). 25 In the canonical correlation analyses, significance was only observed in the targeted data set at 6 weeks only 26 increasing signal strength and model saturation, which demonstrated that prediction models were able to 37 capture predictive associations should they arise even in sparse microbiome data sets. Most notably, 38 simulation performance differed more by signal-to-noise ratio than model saturation, which indicated that the 39 strength of association plays a larger role in the observed weak predictive performance than the number of 40 taxa involved. Surprisingly, we obtained very similar results to our real data values under our lowest simulation 41 setting (model saturation = 5%; signal-to-noise ratio 0.5). As such, it can be suggested that the lack of 42 predictability is due to weak coupling rather than model choice. 43 Bottom panels show aggregated model rankings for all metabolites using R-squared (left) and Spearman 57 correlation (right) using Borda scores (Methods). Higher scores indicate that a model was consistently selected 58 as a better performing. Relatively similar Borda scores and cross-metabolite average predictive performances 59 indicate that no model was clearly the most performant. However, support vector machines (with radial basis 60 function kernel) was highest scoring model. 61

62
In this study, we analyzed the relationship between fecal microbial taxonomic abundances and metabolite 63 concentrations with multi-omic profiling via paired targeted sequencing of the 16S rRNA gene and H 1 NMR 64 metabolomics at multiple time points. Ecological, statistical and machine learning approaches were applied to 65 provide a multi-faceted view of this association. To our knowledge, this study is one of the few comprehensive 66 investigations addressing the microbiome/metabolome interface in a large general population cohort of infants. 67 The microbiome is significantly correlated but weakly predictive of the metabolome 68 Overall global concordance was found from three independent methods (Procrustes analysis, SCCA and 69 univariate Spearman correlation), consistent with previous studies on both infant [19,24] and adult populations 70 [17,30]. This overall effect was found at both time points, suggesting there coupling exists throughout infancy 71 even when there is a high inter-and intra-individual variability in taxonomic compositions [27]. 72 Although the microbiome and the metabolome were significantly correlated in our analyses, most metabolites 73 were not predictable across chosen machine learning models when evaluated by R 2 statistic. Even among the 74 small number of metabolites that are significantly predictable compared to the null model, the maximum 75 performance values were still low for both the untargeted and targeted analyses. Since all fitted models had 76 been shown to be suited for microbiome-associated prediction tasks [28,31] as well as covering both non-77 linear and linear associations, the lack of predictability is unlikely due to model choice. Our data driven 78 simulations further supported this assessment, where low predictability by R 2 was observed when there is low 79 18 signal-to-noise ratio. However, when evaluated using SCC, the models seem to perform better. Most of the 80 metabolome (57% of metabolite bins) were significantly well predicted at SCC cut off criteria of 0.3. This result 81 is similar to Mallick et al. [29], which also performs microbiome to metabolome prediction using data from a 82 cohort of adults. The study reported 53.8% of metabolites classified as well-predicted under the elastic net 83 model with the same cutoff. Despite similar percentages of the metabolome being well predicted, the maximum 84 cross-model average correlation value is 0.647 at 6 weeks and 0.431 at 12 months, which is lower than those 85 seen in Mallick et al. Low R 2 compared to SCC demonstrated that even though models were able to anticipate 86 to a certain degree the trends in metabolite abundances, they were not able to accurately predict the exact 87 concentrations. 88 Even though previous research suggests that the fecal metabolome is a functional representation of the gut 89 microbiome [16], metabolites are often resultant of both host and microbial processes [11,13]. This means for 90 each metabolite, there are host-associated components that drive measured concentrations [32]. We 91 hypothesize that metabolites that are more predictable are those which are associated with microbe-specific 92 metabolism. As such, predictability can be a marker to identify whether a metabolite is host or microbiome 93 driven. 94 However, poor overall performance even amongst predictable metabolites suggests the possible role of other 95 ecological processes. We hypothesized functional redundancy, an aspect ubiquitous in microbial communities 96 [33], plays an important role in this weak coupling. Under functional redundancy, high variability in taxonomic 97 profiles would correspond to little or no change in metabolite abundances, making it difficult to form accurate 98 predictions. Functional redundancy is usually a marker for ecosystem resilience [34]. This is evidenced when 99 Procrustes analyses were adjusted for phylogenetic relatedness using the gUniFrac distance metric where 00 inter-omic association in the targeted data set is no longer significant compared to using Euclidean distances. 01 gUniFrac adjusts for phylogeny by weighting the differences in proportions of each taxa across two samples by 02 the branch length from constructed evolutionary trees [35]. Therefore, the distance between two samples might 03 be larger under gUniFrac if the contributing taxa are evolutionarily divergent. This suggests that samples with 04 19 similar metabolic profiles might be numerically comparable (cluster together under Euclidean distances) but 05 with evolutionarily divergent taxonomic compositions. This supports the hypothesis that phylogenetically distant 06 gut microbes can replace each other while performing similar roles, a hallmark of functional redundancy [33, 07 36]. Since this was only observed in the targeted data set, we contend that the effect of redundancy is 08 localized to certain metabolites and is less pronounced globally. Functional redundancy is also consistent with 09 previous research in human associated microbiomes [37]. 10 Taxa and metabolites selected to be core to the microbiome-metabolome correlation reveal the 11 importance of amino acid metabolism. 12 Taxa and metabolites with non-zero loading coefficients in SCCA analyses were identified as factors driving 13 this overall correlation. The SCCA procedure utilized a L 1 -penalized matrix decomposition of the cross-product 14 matrix akin to a LASSO regression problem [38], which means that variables were selected based on their 15 importance to the overall covariance between taxa and metabolite abundances. 16 In the targeted data set, Firmicutes, Actinobacteria and Proteobacteria were the most represented phyla  [48][49][50], which explains the selection of the Bifidobacterium genera at 6 weeks where 31 infants are exclusively on a milk-based diet. The selection of SCFA-associated bacteria further demonstrates 32 the functional role that SCFA play in the interface between the microbiome and the host [42]. This is further 33 supported by the fact that two SCFAs in our targeted list (butyrate and propionate) were selected as important 34 variables in SCCA analyses. 35 Despite the importance of SCFAs, the most selected metabolites are of the amino acids class (7 out of 10 36 metabolites selected at 6 weeks were amino acids). Although microbes can catabolize amino acids for energy 37 [51], this reaction is not energetically efficient [10]. However, microbes will process amino acids in 38 environments with low carbohydrate availability, such as in the distal colon [9, 52], where one of the end point 39 products is SCFAs. This is further supported by the fact that members of the Peptostreptococcaceae family 40 (most represented family of selected members of the Clostridia class at 6 weeks) has been suggested to be 41 core to the metabolism of amino acids to SCFAs [51]. However, looking across both time points in our analysis, 42 negative correlation with amino acids did not correspond to a positive association with SCFAs (Figure 3), 43 suggesting that amino acid depletion in the fecal metabolome is most likely either to conserve resources from 44 reducing amino acid biosynthesis [10], generate energy in a low carbohydrate environment, or reduce toxic 45 byproducts of protein catabolism [53] rather than to produce SCFAs. However, there could be a possibility that 46 SCFA was produced but rapidly absorbed, which could not be captured by cross-sectional fecal metabolomics. 47 The microbiome is more tightly coupled with the metabolome in early infancy 48 Results suggest some level of significant difference in microbiome-metabolome coupling across development. 49 Canonical correlation, while not significantly different, were lower at 12 months than at 6 weeks, suggesting a 50 time-varying effect. When looking at predictability, we observed a higher number of well predicted metabolites 51 at 6 weeks compared to 12 months. Among those selected as well predicted metabolites, the average 52 performance values (both R 2 and SCC) where higher. This is also replicated in the global untargeted data set. 53

21
There are various factors that can contribute to the difference in microbiome-metabolome coupling between 54 infants at 6 weeks and 12 months. First, there exists substantive differences in dietary patterns for those 55 included in our analysis. The majority of infants at 6 weeks (62%) were exclusively breastfed, while that 56 number is markedly less (35%) at 12 months, at which time infants are also consuming complimentary solid 57 family foods. This transition in diet to solid foods have been shown to induce a change in the gut microbiome 58 composition and diversity due to increased amounts of fiber and protein [54, 55], which might favor certain 59 microbes over others. Such changes in diet, particularly the cessation of breastmilk intake, also contributed 60 towards the development of infant gut microbiomes towards a more "adult like" state [27, 54]. We hypothesized 61 that earlier in life when infants are only consuming a limited type of food (predominantly breast milk or formula), 62 the microbiome participates more actively in host-microbiome co-metabolic activity as infants are more reliant 63 on microbes to breakdown complex nutrients [56]. Conversely, at one year of age where the microbiome has 64 matured, this relationship is not as strongly coupled as a larger share of the metabolome comes from host-65 produced metabolites. 66 However, as analyses were conducted within each timepoint independently with little subject overlap, further 67 investigations are required to make more conclusive statements about the potential time-varying effect of 68 microbiome-metabolome coupling. Particularly, aside from differences in diet, factors such as differences in 69 antibiotic exposure [57] and maternal covariates [58] might result in differences between time points. In future 70 studies we hope to examine this factor using samples across multiple time points for the same infants. 71

Limitations 72
This study has various limitations. First, we utilized partial 16S rRNA gene sequencing instead of shotgun 73 whole genome sequencing, which removes our ability to incorporate the functional metagenome into the 74 microbiome/metabolome interface, as well as limiting our taxonomic resolution to the genus level for most of 75 the analysis [59]. We hypothesized this lack of resolution contribute to overall lack of predictability, as well as

91
In conclusion, we conducted one of the first large-scale multi-omics analysis of the microbiome-metabolome 92 relationship using samples from a large birth cohort study at 2 time points (6 weeks and 12 months). Although 93 we found global concordance between the microbiome and the metabolome, the inter-omic concordance is 94 weak, where bacterial abundances cannot predict metabolite concentrations. In addition to the involvement of 95 host-produced metabolites as part of the metabolome, we suggest the role of functional redundancy as a 96 possible mechanism behind the lack of coupling identified. Additionally, we were able to identify metabolites 97 and microbes driving the overall correlation. Results pointed to support the importance of SCFA metabolism 98 particularly at 6 weeks, as well as the role of amino acid metabolism, either as a source of SCFA and energy in 99 the absence of carbohydrates, or as a general mechanism for microbes to save energy as they incorporate 00 amino acids around their environment. 01 23 We conclude that although the metabolome is a functional output of the microbiome, there exists massive 02 challenges in being able to trace specific microbial contributions to host-microbe metabolism due to the 03 complexity of factors such as functional redundancy. As such, we recommend studies to profile both the 04 microbiome and the metabolome, as aspects of microbial metabolic contributions cannot be found solely 05 through one omic data set. This is particularly important in settings where it is important to have a mechanistic 06 understanding of the role of microbes such as developing of microbiome therapies [61]. 07

Study population 09
Subjects for this study were from the New Hampshire Birth Cohort Study (NHBCS) who provided infant stool 10 samples at 6-weeks and 12-months after birth. As described in previous studies [25, 58], NHBCS is a 11 prospective study of mother-infant dyads in New Hampshire, USA. Participants eligible are pregnant women 12 between the ages of 18 and 45 years old, currently receiving routine prenatal care at one of the study clinics, 13 consuming water out of a private well with no intention to move prior to delivery. The Center for the Protection 14 of Human Subjects at Dartmouth provided institutional review board approval. All methods were carried out in 15 accordance with guidelines. Written informed consent was obtained for participation from all subjects for 16 themselves and their children. Comprehensive sociodemographic, exposure and outcome data such as infant 17 feeding method, delivery mode, maternal smoking status, etc. were collected for all participants through 18 surveys, medical records and telephone interviews conducted during pregnancy, about 6 weeks postpartum, 19 and updated every 4 months up until first year of age and every 6 months thereafter. 20

Collection of infant stool samples 21
Infant stool samples were collected at 6-weeks and 12-months. Stool samples were provided in diapers and 22 stored by subjects in their home freezer (-20°C) until they were able to return it to the study site. Stool was 23 thawed at 4 °C so that it could be aliquoted into cryotubes. Stools collected for 16S rRNA gene sequencing 24 were aliquoted (range 350-850 mg) into 3ml RNAlater and homogenized before storing at −80 °C. Stools 25 collected for metabolomic analysis were aliquoted (1-2 grams) into 15ml centrifuge tubes before storing at −80 26

°C. 27
Taxonomic profiling using 16S rRNA targeted gene sequencing 28 RNAlater stool samples were thawed and DNA was extracted using the Zymo Fecal DNA extraction kit (Cat# 29 integrals of each bin were normalized to the total spectral intensity of each spectrum and transferred to 78 analysis software. This resulted in a collection of spectral bins with bin-specific relative abundances, which will 79 be called the untargeted data. In addition, relative concentration of library-matched metabolites (selected from 80 the literature implicated to be important in host-microbe relationships - Table S1) was determined by using 81 Chenomx NMR Suite 8.4 Professional software [26].This data set will be called the targeted data set. 82

Software and tools 83
All analyses were performed using the R programming language (v. 3.6.3) [65] and associated packages. All 84 data wrangling steps were performed using phyloseq [69] [75] packages. Additionally, the tidymodels [76] suite of packages was 87 utilized to assist in all modelling tasks. Specific packages used for modelling will be enumerated below. All 88 scripts as well as intermediary analysis objects are available on github with all dependencies and their versions 89 (https://github.com/qpmnguyen/infant_metabolome_microbiome). 90

Data transformation and normalization 91
For microbiome data, we retained all ASVs present in at least 10% of samples [29] and added one 92 pseudocount to all cells [77]. We then subsequently aggregated all ASVs to the genus taxonomic level [28] and 93 converted data to relative proportions using total read counts by sample to account for differential sequencing 94 depth. We further filtered out taxa with mean relative proportion < 0.005% [78]. To address the compositional 95 problem induced by a sum to one constraint, we apply the centered log ratio transformation (CLR), which is 96 often used to remove such constraints in microbiome data sets [79]. For metabolomic data sets, we employed different transformations to approximate homoscedasticity depending 05 on the data type (targeted vs untargeted). For targeted metabolites, we performed a ( + 1) transformation 06 while for untargeted metabolites we utilized the arcsine square root transformation which has been previously 07 used for transforming composition metabolomics data sets [29]. 08

Distance matrix analyses 09
Principal coordinates analysis (PCoA) was performed using the pcoa function from the ape package in R [81] 10 with sample distance matrices. The PCoA procedure seeks to represent high dimensional multivariate data 11 sets in lower dimensions through eigen decomposition of the doubly centered distance matrix. PCoA allows the 12 usage of non-Euclidean distances between samples such as ecological indices, which makes it a preferable 13 method for sample ordination compared to principal component analysis (PCA). 14 We constructed Euclidean distance matrices for both metabolic and taxonomic profiles post data using the AlignSeqs function from the DECIPHER package in R [65] and trees were midpoint rooted using 20 phytools [66]. Since multiple sequence alignment is not conserved under filtering and aggregation of ASVs, 21 gUniFrac distance calculations were performed with pre-filtered ASV-level abundances normalized to relative 22 abundances. 23 The first two axes of constructed ordinations were then compared using a symmetric Procrustes procedure 24 implemented in the protest function in the vegan package [67]. Procrustes superimposes two ordinations by 25 translating and rotating the coordinates, which preserves the general structure of the data. This method 26 performs a superimposition fit between two data sets minimizing the sum-of-squared differences (m 2 ), which 27 describes the degree of concordance between the two configurations normalized to unit variance. Significance 28 is obtained by testing against the permuted null using a permutation test. This method was shown to have 29 more power while also limiting type I error compared to the traditional Mantel test in ecological analysis [82]. 30 Significance was determined using a permutation test on the sum of squared differences with 999 31 permutations [68]. 32

Sparse canonical correlation and Spearman correlation analyses 33
Sparse canonical correlation analysis (sCCA) was performed to identify strongly associated metabolite-34 microbe groups. sCCA seeks to find linear combinations of variables from each dataset that maximizes the 35 correlation with each other while simultaneously thresholding variable specific weights to induce sparsity and 36 performing variable selection. The correlation coefficient in the first canonical variate quantifies the overall 37 degree of multivariate associations. As such, sCCA is a popular method in integrating multi-omics datasets 38 with the ability to select more biologically relevant sets of features compared to traditional ecological methods 39 such as co-inertia analysis [83]. In this study, we use the sCCA implementation in the package PMA in R [38] 40 which uses a novel penalized matrix decomposition procedure to achieve sparsity [84]. We tune 41 hyperparameters using a permutation approach in the CCA.permute function (nperms = 50) prior to fitting the 42 final model. We obtain the correlation coefficients as a measure of overall correlation between the two data 43 sets and calculated a bootstrapped 95% confidence interval (nboot = 5000) as well as performing a 44 permutation test (nperm = 1000) at the 0.05 significance level. In order to keep the structure of the data set 45 29 across different permutations, we use the function randomizeMatrix from the package picante in R [85] using 46 the richness null model, which randomizes community abundances within samples to maintain sample species 47 richness. 48 Pairwise Spearman correlations were also performed using the cor function in R. Hypothesis testing was done 49 using cor.test, with multiple hypothesis testing correction using the Benjamini-Hochberg procedure using 50 p.adjust. An FDR value of 0.05 is used as cutoff for significance pairwise correlations. Visualization was done 51 using pheatmap package in R. 52

Predictive modelling and evaluation 53
We choose candidate models based on previous research utilizing supervised learning with microbiome 54 associated prediction tasks [28,29,31] For the second simulation scenario, null models were assessed through a permutation procedure using the 07 picante package in R as described earlier. A total of 500 permutations was performed for each model. 08 To evaluate the predictive capacity of models for each simulation scenario, each data set was split into a train 09 and test set (80% train; 20% test). Within each training set, a 10-fold cross validation procedure was employed 10 to tune any hyperparameters. Similar evaluation metrics were assessed as described in the model fitting 11 section. 12  Table 1. Selected characteristics of subjects contributing samples at 6 weeks (n = 158) and at 12 months of 92 age (n = 282). 93 Euclidean distances of log-transformed metabolite profiles. There were significant associations between the 99 microbiome and the metabolome (both targeted and untargeted) when utilizing Euclidean distances, however 00 this association goes away when the gUniFrac distance was employed for the targeted metabolites only. 01  Bottom panels show aggregated model rankings for all metabolites using R-squared (left) and Spearman 20 correlation (right) using Borda scores (Methods). Higher scores indicate that a model was consistently selected 21 as a better performing. Relatively similar Borda scores and cross-metabolite average predictive performances 22 indicate that no model was clearly the most performant. However, support vector machines (with radial basis 23 function kernel) was highest scoring model. 24  Inter-omics Procrustes biplots comparing PCoA ordinations of targeted metabolite pro les and taxonomic relative abundances for 6 weeks (left panels) (n = 158) and 12 months (right panels) (n = 262). Top panels present analyses based on ordinations from Euclidean distances of genus level abundances after centered log ratio transformation and Euclidean distances of log-transformed metabolite pro les. Bottom panel presents analyses based on gUniFrac distance of amplicon sequence variant (ASV) relative abundances and Euclidean distances of log-transformed metabolite pro les. There were signi cant associations between the microbiome and the metabolome (both targeted and untargeted) when utilizing Euclidean distances, however this association goes away when the gUniFrac distance was employed for the targeted metabolites only.

Figure 3
Pairwise Spearman correlation of concentration-tted metabolites and genus-level taxonomic abundances for 6-weeks (panel A, N = 168) and 12-months (panel B, N = 282) infants. Left panel displays the overall correlation pattern, where non-signi cant correlations are not colored (false discovery rate (FDR) controlled q-value < 0.05). Right panel displays the same heatmap restricted to taxa and metabolites selected by the sparse CCA procedure. Additionally, correlation coe cient of the rst sCCA variate pair, bootstrapped 95% con dence interval and permutation p-value are also reported. Signi cant microbiome-metabolome correlation was observed at both time points, however no signi cant difference was found between the time points. Forest plots of each prediction performance metric (R-squared -Panel A, Spearman correlation -Panel B) for each time point (6 weeks (n = 158), 12 months (n = 282)) across all 36 metabolites and 4 machine learning models. 95% credible interval and predictive posterior means were generated using Bayesian modelling of the evaluation statistic (Methods) after 100 repeats of 5-fold nested cross validation. Red vertical lines indicate a value of 0 for the evaluation metric (equivalent to null model). Metabolites were classi ed as predictable if the null value did not lie within the estimated 95% credible interval. For most metabolites, predictive performance was not signi cantly better than null models  metabolites. Bottom panels show aggregated model rankings for all metabolites using R-squared (left) and Spearman correlation (right) using Borda scores (Methods). Higher scores indicate that a model was consistently selected as a better performing. Relatively similar Borda scores and cross-metabolite average predictive performances indicate that no model was clearly the most performant. However, support vector machines (with radial basis function kernel) was highest scoring model.