Children Developing Celiac Disease Have a Distinct Gut Microbiota Composition, IgA Response and Plasma Metabolome

Objective: Celiac disease (CD) is a gluten-induced, immune-mediated disease. Almost all CD patients possess human leukocyte antigen (HLA) DQ2 or DQ8; however, only a small subset of individuals carrying these alleles develop CD. Despite tremendous efforts, the role of environmental factors in CD pathogenesis remains unclear, and genetics alone cannot explain the increasing incidence and prevalence rates. The main goal of this study was to determine the alterations in gut microbiota composition and function, IgA response, and plasma metabolome in the first five years of life and to establish a microbial link to pediatric CD pathogenesis.
Design: We conducted a longitudinal study focusing on three phases of gut microbiota development at ages 1, 2.5 and 5. Fecal samples were obtained from 16 CD progressors (children who developed CD during or after the study) and 16 age-, breastfeeding- and HLA-matched healthy controls. We used 16S sequencing combined with functional analysis, flow cytometry, immunoglobulin A (IgA) sequencing (IgA-seq), and plasma metabolomics to determine the alterations.
Results: We identified a distinct gut microbiota composition in CD progressors in each microbiota developmental phase. Pathogenesis- and inflammation-related microbial pathways were enriched in CD progressors. Moreover, CD progressors had significantly more IgA-coated bacteria, and the bacterial targets of the IgA response were significantly different in CD progressors. Proinflammatory and pathogenesis-related bacterial metabolic pathways were enriched in children developing CD. Analyzing the plasma metabolome, we identified 19 plasma metabolites that were significantly different. Notably, inflammatory metabolites, particularly microbiota-derived taurodeoxycholic acid (TDCA), were increased, and the anti-inflammatory metabolites oleic acid and oleamide were decreased in CD progressors.
Conclusion: Our study defines an inflammatory gut microbiota for children developing CD in the first five years of life and establishes a link between gut microbiota composition and chronic inflammation in pediatric CD. These inflammatory features are related to gut microbiota composition, function, IgA response and related plasma metabolites.


Background
Celiac disease (CD) is a gluten-induced autoimmune disorder that is predicted to affect 1 in 100 individuals worldwide [1]. The adaptive autoimmune response in CD is characterized by gluten-specific CD4+ T cells and antibodies against the gluten gliadin peptide and against the enzyme tissue transglutaminase (tTG) [2], which is responsible for deamidating the gliadin peptide [1]. Antigen-presenting cells (APCs) present gliadin peptides to T cells and cause mucosal lesions in the small intestine [3]. Almost all CD patients possess HLA-DQ2 or HLA-DQ8. Although 20%-40% of the population in Europe and the USA carries these alleles, only 1% of individuals develop the disease [4]. King et al. recently showed that the incidence of CD has been increasing by 7.5% per year in the last decades [5]. Furthermore, even among twins, the concordance of CD is not 100% [6,7]. These findings suggest that CD is a multifactorial disease.
CD was previously considered an intestinal disease occurring only in children, with a clinical presentation of malabsorption syndrome and failure to thrive [8]. The incidence of childhood CD continues to increase in the US and Europe [9,10]. In a local example (Denver, Colorado), the prevalence of CD by age 15 was found to be 3.1%, which is two-fold higher than the estimate for the adult population [11]. Currently, the only way to treat CD is strict adherence to a gluten-free (GF) diet, but 20% of patients do not respond to a GF diet and continue to have persistent or recurrent symptoms [12]. Moreover, gluten withdrawal causes several problems, including psychological problems, fear of involuntary or inadvertent contamination with gluten, vitamin and mineral deficiencies, metabolic syndrome, increased cardiovascular risk and severe constipation [13][14][15][16]. These CD-related drawbacks greatly impair the development of children with CD and their overall quality of life. Thus, understanding the triggers of pediatric CD is important for developing alternative treatments to the gluten-free diet and new tools to prevent CD.
Various environmental factors are implicated in CD development [17], but the roles of these environmental factors in CD progression remain largely unknown. Gut microbiome studies have observed an altered microbial [18] and metabolite composition [19,20] in both infant and adult CD patients but have not identified any causal link [21,22]. Immunoglobulin A (IgA) is the most abundant antibody isotype at mucosal surfaces and is a major mediator of intestinal immunity in humans [23]. IgA sequencing (IgA-seq) combines bacterial cell sorting with high-throughput sequencing to identify distinct subsets of highly IgA-coated (IgA+) and non-coated (IgA-) microbiota [24][25][26]. It was previously shown in a mouse model that IgA+ microbiota could induce more severe colitis than IgA- microbiota [27]. Similarly, IgA-seq identified Escherichia coli as an inflammatory bacterium enriched in Crohn's disease patients with spondyloarthritis [28]. However, until our study, IgA-seq had not been applied to CD samples to identify potential pathogens and immunoregulatory commensals involved in CD onset. Therefore, we hypothesized that children developing CD would have a more inflammatory gut environment and that the IgA response in CD progressors would have specific targets that differ from the control targets.
In this study, we assessed the composition and function of the gut microbiota in a retrospective, longitudinal cohort of 32 children matched for human leukocyte antigen (HLA) genotype and breastfeeding duration (n=16/group). We focused on fecal samples obtained at ages 1, 2.5 and 5 because these samples represent the three stages of gut microbiota development [29] in early childhood, when most of these children had not yet been diagnosed with CD (pre-CD stage; CD progressors). We then identified the functional pathways enriched in CD progressors and determined the targets of IgA in the gut microbiota using IgA-seq. Lastly, we compared the plasma metabolome and identified significant differences. Our findings demonstrate that children who go on to develop CD have significant alterations in their gut microbiome years before diagnosis. CD-associated gut microbiota are enriched in inflammation- and pathogenicity-related bacteria, as well as in microbial functions and metabolites that potentially contribute to chronic inflammation in CD.

Human Fecal Samples
The fecal samples were obtained from subjects in the All Babies in Southeast Sweden (ABIS) cohort. The ABIS study was ethically approved by the Research Ethics Committees of the Faculty of Health Science at Linköping University, Sweden (Ref. 1997/96287 and 2003/03-092) and the Medical Faculty of Lund University, Sweden (Dnr 99227, Dnr 99321). All children born in southeast Sweden between 1 October 1997 and 1 October 1999 were recruited. Informed consent was obtained from the parents. Fresh fecal samples were collected either at home or at the clinic. Samples collected at home were stored at -20 °C with freeze clamps, mailed to the WellBaby Clinic and stored dry at −80 °C. A questionnaire was completed by the parents to collect participants' health information including, but not limited to, breastfeeding duration, antibiotic use, and gluten exposure time. In total, 68 fecal samples were collected for the analysis (10 at age 1, 32 at age 2.5 and 26 at age 5). Only 10 fecal samples were collected from the 32 subjects at age 1 because ABIS parents evidently found it more difficult to collect stool samples at that age; most of them therefore collected fecal samples only when their children were 2.5 and 5 years old.

IgA+ and IgA- Bacteria Separation
IgA+ and IgA- bacteria separation was performed as previously described [27]. Briefly, frozen human fecal samples were placed in Fast Prep Lysing Matrix D with ceramic beads (MP Biomedicals) and incubated in 1 ml Phosphate Buffered Saline (PBS) per 100 mg of sample on ice for 5 min for hydration. The samples were then homogenized by bead beating for 7 s (Minibeadbeater; Biospec) and centrifuged at 50 x g for 10 min at 4°C to remove large plaques. Fecal bacteria in the supernatants were collected (200 μl/sample), washed three times with 500 μl PBS containing 1% (w/v) Bovine Serum Albumin (BSA, American Bioanalytical) and centrifuged for 5 min (6,000 rpm, 4°C). A sample of this washed bacterial suspension (50 μl) was collected as the pre-sorting sample for 16S sequencing analysis. After washing, bacterial pellets were re-suspended in 50 μl blocking buffer (PBS containing 1% (w/v) BSA and 20% Normal Mouse Serum; Jackson ImmunoResearch), incubated for 20 min on ice, and stained with 100 μl PE-conjugated mouse anti-human IgA (1:40; Miltenyi Biotec clone IS11-8E10) for 30 minutes on ice. Samples were subsequently washed 3 times with 500 μl PBS containing 1% (w/v) BSA before flow cytometry analysis or cell separation. PE anti-human IgA stained bacteria were incubated with Anti-PE Magnetic Activated Cell Sorting (MACS) beads (Miltenyi Biotec) (1:5) for 30 minutes on ice and then separated on a custom magnetic plate for 10 minutes on ice. Fecal bacteria bound to the magnetic plate were collected as IgA+ samples for 16S sequencing analysis. The stained, MACS bead-treated bacteria that did not bind to the magnetic plate were collected (20-40 μl) and passed through MACS molecular columns (Miltenyi Biotec) (one sample/column), followed by flushing with 480 μl PBS containing 1% (w/v) BSA. The total pass-through (~500 μl) was loaded onto the columns one more time. The columns were flushed with 500 μl PBS containing 1% (w/v) BSA. The total column pass-through (~1 ml) was saved as IgA- samples for 16S sequencing analysis.
Fecal IgA Flow Cytometry Analysis.
Bacterial cells were isolated from fecal samples as described in the IgA+ and IgA- Bacteria Separation section of this manuscript. Bacteria were stained with PE anti-human IgA antibodies (1:100; Miltenyi Biotec clone IS11-8E10) for 30 min on ice. After washing twice, bacteria were stained with TO-PRO®-3 (ThermoFisher Scientific) to distinguish bacteria from fecal debris or particles. Stained bacteria were analyzed on a BD FACSAria™ IIIu cell sorter (Becton-Dickinson) as previously described [24] and gated as TO-PRO®-3+ IgA+/- cells.
16S rRNA Gene Sequencing
16S rRNA gene sequencing of the V4 region was performed for all bacterial samples on the MiSeq platform with barcoded primers. Briefly, all bacterial samples were suspended in 90 μl of MicroBead Lysis Solution with 10% RNase A and sonicated in a water bath at 50°C for 5 minutes. Samples were transferred to a plate containing 50 μl of Lysing Matrix B (MP Biomedicals) and homogenized by bead beating for 5 minutes. After centrifugation (4122 x g, 4°C) for 6 minutes, the supernatant was transferred to 2 ml deep-well plates (Axygen Scientific). Bacterial DNA was extracted and purified using the MagAttract Microbial kit (QIAGEN) following the manufacturer's instructions. PCR was performed to amplify the V4 region of the 16S ribosomal RNA gene (33 cycles) in duplicate (3 μl purified DNA per reaction; Phusion DNA polymerase, New England Biolabs) [27]. After amplification, PCR products were normalized with the SequalPrep™ normalization plate kit (ThermoFisher Scientific) and pooled. The pooled library concentration was determined using the NGS Library Quantification Complete kit (Roche 07960204001), and the library was then loaded on a MiSeq sequencer. The Illumina MiSeq Reagent Kit V2 (500 cycles) was used to generate 2x250 bp paired-end reads. The raw reads were demultiplexed in QIIME 1 (version 1.9). Sequencing yielded a mean of 30,471 reads per sample.

Bioinformatics Analysis and Statistics
Filtering and trimming of the bacterial 16S rRNA amplicon sequencing reads, and sample inference that turns amplicon sequences into an operational taxonomic unit (OTU) table, were performed with dada2 [30] using the Ribosomal Database Project Training Set 16 [31]. Exploratory and inferential analyses of microbial diversity were performed using phyloseq [32] and vegan [33], and included non-metric multidimensional scaling (NMDS) analysis using Bray-Curtis dissimilarity, Principal Component Analysis (PCA), alpha and beta diversity estimates, and taxa agglomeration. Differential OTU abundance was assessed per time point by edgeR [34] with two-sided empirical Bayes quasi-likelihood F-tests. P-values were corrected using the Benjamini-Hochberg false discovery rate (FDR), and FDR < 0.25 was considered statistically significant [35]. The prediction of gene content and pathway abundance was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and PICRUSt2 [36][37][38]. Differential KEGG pathway abundance was assessed using limma [39]. The bar plots and box plots were made using ggplot2 [40], and the heatmap using pheatmap [41].
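The Benjamini-Hochberg correction and the FDR < 0.25 threshold described above can be illustrated with a minimal Python sketch. This is not the study's pipeline (which used edgeR and limma in R); the function name and the example p-values are hypothetical and serve only to show how adjusted values are derived and thresholded.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


FDR_THRESHOLD = 0.25  # threshold stated in the text

# Hypothetical raw p-values for five OTUs (illustration only)
p_values = [0.001, 0.02, 0.04, 0.30, 0.80]
q_values = benjamini_hochberg(p_values)
significant = [q < FDR_THRESHOLD for q in q_values]
```

With these made-up inputs, the first three OTUs fall below the 0.25 threshold and the last two do not.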

Plasma Metabolomics Analysis
Plasma samples for metabolomics analysis were prepared as previously described [42,43]. Metabolite extraction from plasma was achieved using a mixture of isopropanol, acetonitrile, and water at a ratio of 3:3:2 v/v. Extracts were divided into three parts: 75 μl for gas chromatography combined with time-of-flight high-resolution mass spectrometry, 150 μl for reverse-phase liquid chromatography coupled with high-resolution mass spectrometry, and 150 μl for hydrophilic interaction liquid chromatography coupled with tandem mass spectrometry, and analyzed as previously described [42,43]. We used the NEXERA XR UPLC system (Shimadzu, Columbia, MD, USA) coupled with the Triple Quad 5500 System (AB Sciex, Framingham, MA, USA) to perform hydrophilic interaction liquid chromatography analysis, the NEXERA XR UPLC system (Shimadzu, Columbia, MD, USA) coupled with the Triple TOF 6500 System (AB Sciex, Framingham, MA, USA) to perform reverse-phase liquid chromatography analysis, and an Agilent 7890B gas chromatograph (Agilent, Palo Alto, CA, USA) interfaced to a Time-of-Flight Pegasus HT Mass Spectrometer (Leco, St. Joseph, MI, USA) to perform gas chromatography analysis. The GC system was fitted with a Gerstel temperature-programmed cooled injection system (model CIS 4). An automated liner exchange (ALEX) (Gerstel, Mülheim an der Ruhr, Germany) was used to eliminate cross-contamination from the sample matrix between sample runs. Quality control was performed using metabolite standard mixtures and pooled samples. A standard quality control sample containing a mixture of amino and organic acids was injected daily to monitor mass spectrometer response. A pooled quality control sample was obtained by taking an aliquot of the same volume from all samples of the study and was injected daily with each batch of analyzed samples to determine the optimal dilution of the batch samples and to validate metabolite identification and peak integration. Collected raw data were manually inspected, merged, imputed and normalized by the sample median.
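As an illustration of normalization by the sample median, the following Python sketch divides every metabolite intensity in a sample by that sample's median intensity. The text does not specify whether the normalization was applied by division or combined with a transformation, so division is an assumption here; the function name, sample names and intensities are hypothetical.

```python
from statistics import median


def normalize_by_sample_median(intensities):
    """Divide each metabolite intensity in one sample by that sample's median.

    Assumption: normalization is a simple division by the median; the
    source only states that data were normalized by the sample median.
    """
    med = median(intensities)
    if med == 0:
        raise ValueError("sample median is zero; cannot normalize")
    return [value / med for value in intensities]


# Hypothetical raw intensities for two samples (illustration only)
samples = {
    "sample_A": [100.0, 200.0, 400.0],
    "sample_B": [10.0, 20.0, 40.0],
}
normalized = {
    name: normalize_by_sample_median(values) for name, values in samples.items()
}
```

After this step the two hypothetical samples, which differ only by a ten-fold loading factor, have identical normalized profiles, which is the purpose of a per-sample scaling.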
Metabolite pathway analysis.
Metabolomic data were analyzed as previously described by Tolstikov et al. [44]. Identified metabolites were subjected to pathway analysis with MetaboAnalyst 4.0, using the Metabolite Set Enrichment Analysis (MSEA) module, which consists of an enrichment analysis relying on measured levels of metabolites and pathway topology and provides visualization of the identified metabolic pathways. Accession numbers of detected metabolites (HMDB, PubChem, and KEGG Identifiers) were generated, manually inspected, and utilized to map the canonical pathways. MSEA was used to interrogate functional relation, which describes the correlation between compound concentration profiles and clinical outcomes.

Study Cohort
All Babies in Southeast Sweden (ABIS) is a prospective population-based study that established a large biobank of biological specimens obtained longitudinally at birth and at ages 1, 2.5, and 5. In total, 17,000 (78.6%) of the 21,700 children born in southeast Sweden between 1 October 1997 and 1 October 1999 were included after their parents had given informed consent. Among this large cohort, 230 (1.35%) children had been diagnosed with celiac disease by the end of 2017, as validated from the Swedish National Diagnosis Register (SNDR).
CD diagnosis was based on the International Classification of Diseases, 10th revision (ICD-10) code K90.0 according to the SNDR and met the ESPGHAN CD criteria, but detailed information on tTG levels or biopsy data with Marsh stage was not available. To determine the role of gut microbiota in CD pathogenesis, we selected a sub-cohort of 32 individuals from the ABIS samples. We chose 16 subjects who developed CD but had not been diagnosed with any other autoimmune disease as of December 2017.
The selected 16 subjects were a subset of children with CD whose fecal samples were available for at least two of the time points at ages 1, 2.5, and 5. We matched this group with 16 healthy controls based on HLA distribution and breastfeeding duration (See Additional file 1: Table S1). CD was diagnosed after age 5 in 11 individuals, at age 1.8 in one subject, and between the ages of 2.5 and 5 in the other four. We studied 68 longitudinal stool samples in total (See Additional file 1: Table S1). Although we did not match subjects for other parameters, timing of gluten exposure, delivery method, breastfeeding duration, family history of CD, infections and use of antibiotics were comparable between groups (See Additional file 1: Table S1).

Celiac Disease Progressors Have a Distinct Gut Microbiota Composition
By sequencing the V4 region of the 16S rRNA gene [27], we identified 661 operational taxonomic units (OTUs) (See Additional file 2: Table S2). Consistent with previous studies [45], gut microbiome alpha diversity increased until age 2.5 and remained stable up to age 5 in both groups. Although it did not reach statistical significance, alpha diversity was slightly higher (observed OTUs: unpaired t test p=0.0887; Simpson index: unpaired t test p=0.0787) for the CD progressors at age 1 (Figure 1A). Beta diversity was comparable between CD progressors and healthy controls in each phase (Figure 1A). Non-metric multidimensional scaling (NMDS) plots showed a trend of separation of gut microbiome composition between CD and healthy control individuals at ages 1 and 2.5 (Figure 1B). Relative abundance analyses of microbial taxa whose relative abundance was more than 1% revealed that CD progressors had higher levels of Firmicutes than controls (mean average abundance (MAA): CD=0.698, Ctrl=0.427; False Discovery Rate (FDR)=0.143) at age 1 (Figure 1C). These differences dissipated over time. Among the genera whose relative abundance was more than 1%, no significant differences were observed (See Additional file 3: Table S3). Significance was determined by FDR<0.25.
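The diversity measures referred to above (observed OTUs, the Simpson index, and Bray-Curtis dissimilarity for beta diversity) can be sketched in Python as follows. This is an illustrative sketch, not the phyloseq/vegan code used in the study; it assumes the Gini-Simpson form of the index (one minus the sum of squared proportions), and the two count vectors are hypothetical.

```python
def observed_otus(counts):
    """Alpha diversity: number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def simpson_index(counts):
    """Alpha diversity: Gini-Simpson index, 1 - sum(p_i^2) (assumed form)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return 1.0 - sum((c / total) ** 2 for c in counts)


def bray_curtis(counts_a, counts_b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples."""
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must share the same OTU table columns")
    numerator = sum(abs(a - b) for a, b in zip(counts_a, counts_b))
    denominator = sum(counts_a) + sum(counts_b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical OTU counts for two samples over the same four OTUs
sample_1 = [10, 10, 0, 0]
sample_2 = [5, 5, 5, 5]

richness = (observed_otus(sample_1), observed_otus(sample_2))
simpson = (simpson_index(sample_1), simpson_index(sample_2))
dissimilarity = bray_curtis(sample_1, sample_2)
```

In this toy case the second sample is both richer and more even, so it scores higher on both alpha diversity measures, while the Bray-Curtis value summarizes how different the two compositions are on a 0 to 1 scale.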

CD Progressors Have More Bacteria Coated with IgA Indicating an Inflammatory Gut Microbiota Composition
To test our initial IgA hypothesis, we used a modified IgA-sequencing method (See Additional file 5: Figure S2A). Principal component analysis (PCA) showed a clear separation between IgA+ and IgA- bacteria at all ages in both control and CD samples (Figure 2A). We also identified an overall separation for all samples (See Additional file 5: Figure S2B). We confirmed this finding using flow cytometry (See Additional file 5: Figure S2C). The flow cytometry analysis revealed that in controls the abundance of IgA+ bacteria increased from a least squares (LS) mean of 4.57% at age 1 to 8.88% at age 2.5 and was 6.08% at age 5. In the CD progressors, on the other hand, IgA+ bacteria had an LS mean of 10.99% at age 1, a two-fold increase compared to controls (p=0.24). It was 8.34% at age 2.5, comparable to control samples (8.88%, p=0.99). At age 5, 12.8% of the bacteria were IgA+ in CD progressors, a two-fold increase compared to the controls (6.03%) (p=0.026, Figure 2B), revealing that CD progressors have more IgA+ bacteria in their gut than healthy controls, especially at age 5. To confirm that this result was not driven by the five CD progressors who developed CD before age 5, we repeated the same analysis after removing them and obtained a similar significant result at age 5 (CD: 12.8%, Control: 6.02%, p=0.027; See Additional file 5: Figure S2D). This result indicates a more pathogenic gut microbiota composition and a more inflammatory environment for CD progressors. Moreover, we showed that only a small fraction of the microbiota is coated by IgA during child gut microbiota development in healthy controls (~5-8%) and that this fraction is increased in children developing CD (8.5-12%).
A Specific IgA Response to Bacteria Develops After Age 1
Because there are very few reports on the IgA response in early human gut microbiota development [26], we first focused on the results obtained from the healthy children. We did not observe any difference between IgA+ and IgA- samples in the control group at age 1 (See Additional file 6: Figure S3A). This result suggests that the IgA response does not target specific bacteria at this early stage of development.

IgA Response Targets Are Comprised of Different Bacteria in CD progressors
Consistent with the presorting data, alpha diversity increased at age 2.5 and remained stable in both control and CD IgA- groups (Figure 2C). Likewise, beta diversity was comparable between CD IgA- and control IgA- samples (Figure 2C). There was a separation of IgA- gut microbiota composition between CD and healthy control individuals at ages 1 and 2.5 (Figure 2D). At the phylum level, no difference was observed in the relative abundance of IgA+ or IgA- microbiota at ages 1 or 2.5 between CD progressors and healthy controls. Likewise, the relative abundances of the phyla did not show any difference for IgA- microbiota or presorting analysis at age 5. On the other hand, the relative abundance of Firmicutes was higher (MAA: CD=0.528, Ctrl=0.408; FDR=0.0896) in CD progressors' IgA+ microbiota compared to healthy controls, and Actinobacteria was higher in healthy subjects' IgA+ microbiota (MAA: CD=0.078, Ctrl=0.149; FDR=0.0896). No difference was observed at the genus level (Figure 2E). In addition to the differences caused by altered gut microbiota composition, we also identified 72 OTUs at age 2.5 and 45 OTUs at age 5 whose abundances were the same in the gut microbiota of CD and healthy samples (presorting) but which were differentially targeted by the immune system (See Additional file 7: Table S4). For example, Lachnospiraceae (FC=113, FDR=7.68e-4), Bacteroides finegoldii (FC=382, FDR=3.64e-5) (See Additional file 6: Figure S3C, 1E), and Bacteroides vulgatus (FC=42.7, FDR=2.893e-3) OTUs at age 2.5 and Enterobacter (FC=468, FDR=7.18e-3), Blautia (FC=168, FDR=7.18e-3) (See Additional file 6: Figure S3C), Lactococcus (FC=57.3, FDR=1.37e-2), and Clostridium sensu stricto (FC=24.9, FDR=0.0425) OTUs at age 5 were highly coated with IgA in CD groups but not in controls. Overall, these results indicate that not only gut microbiota composition, but also the IgA response to microbiota, is altered in CD progressors.

Plasma Metabolomics Analysis Reveals an Inflammatory Metabolic Profile for CD Progressors
In order to determine the early markers of CD progression in the plasma metabolome and its link to gut microbiota, we applied a targeted plasma metabolomics analysis. We used 10 CD and 9 control plasma samples obtained at age 5. Three subjects in the CD group were diagnosed before age 5. In total, we identified 386 metabolites. Partial least squares-discriminant analysis (PLS-DA) showed a clear separation of the plasma metabolites between CD progressors and healthy control groups (Figure 4A). Volcano plots show the most significantly altered metabolites (Figure 4B, See Additional file 13: Table S7). We identified a clear separation between these groups, and 19 out of 387 metabolites were significantly different (p < 0.05, See Additional file 13: Table S7) between the two groups. The top three most altered metabolites (p < 0.01) were TDCA (FC=2, P=0.008), glucono-D-lactone (FC=1.47, P=0.009) and isobutyryl-L-carnitine (FC=2.058, P=0.009). All three top metabolites were increased in CD samples (Figure 4C). The most altered metabolite, TDCA, is a conjugated bile acid that was shown to be proinflammatory [51]. TDCA is mainly produced by gut microbes, particularly by Clostridium XIVa and Clostridium XI, through 7-α-dehydroxylation of taurocholic acid and cholic acid [52]. This result is consistent with our microbiota analysis, since we identified several Clostridium XIVa OTUs that were significantly more abundant in CD samples, especially at age 5 (FC=40.3, FDR=0.0139, P<0.001). These Clostridium XIVa OTUs were highly targeted by IgA in CD progressors (IgA+ vs. IgA-: FDR=0.023) (Figure 4D). In addition, the levels of two anti-inflammatory metabolites, oleic acid (FC=0.69, P=0.027) and its derivative oleamide (FC=0.73, P=0.029), were significantly decreased in CD progressors. The most changed metabolite in terms of fold change was 2-methyl-3-ketovaleric acid, which was increased more than eightfold in CD progressors (FC=8.6, P=0.048) (Figure 4C).
The heat map shows 50 of the most altered metabolites and indicates a strong signature in the plasma metabolome in which 19 metabolites were significantly altered (p<0.05) (Figure 4E). We used Pathway Analysis to determine the functions related to these metabolites (Figure 4F, See Additional file 10: Table S5). Indeed, the pentose phosphate pathway (PPP) (Raw P=0.0246), lysine degradation (Raw P=0.028), and glycolipid metabolism (Raw P=0.0397) were the most significantly altered pathways.
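To make the fold changes and p-values reported in this section easier to relate to a volcano plot, the short Python sketch below converts them to the conventional volcano coordinates, log2(fold change) and -log10(p). The fold changes and p-values are taken from the text; the use of these particular axes and the helper name are assumptions, since the exact plotting settings are not described here.

```python
import math

# (fold change CD vs. control, p-value) as reported in the text
reported = {
    "TDCA": (2.0, 0.008),
    "Glucono-D-lactone": (1.47, 0.009),
    "Isobutyryl-L-carnitine": (2.058, 0.009),
    "Oleic acid": (0.69, 0.027),
    "Oleamide": (0.73, 0.029),
    "2-Methyl-3-ketovaleric acid": (8.6, 0.048),
}

P_THRESHOLD = 0.05  # significance cut-off stated in the text


def volcano_coordinates(fold_change, p_value):
    """Return (log2 fold change, -log10 p), the usual volcano plot axes."""
    return math.log2(fold_change), -math.log10(p_value)


coordinates = {
    name: volcano_coordinates(fc, p) for name, (fc, p) in reported.items()
}

# Metabolites to the right of zero are increased in CD progressors,
# those to the left are decreased.
increased = sorted(name for name, (x, _) in coordinates.items() if x > 0)
decreased = sorted(name for name, (x, _) in coordinates.items() if x < 0)
```

On these axes a two-fold increase such as TDCA sits at x = 1, while the decreased metabolites oleic acid and oleamide fall at negative x values.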

Discussion
Recent studies have demonstrated strong associations between the gut microbiota and the pathogenesis of autoimmune diseases [53]. Studies of the gut microbiome in CD have demonstrated intestinal dysbiosis in CD patients [18,22,54,55]. However, the majority of these studies were performed using samples from adults with diagnosed disease, and none of them used a longitudinal approach as in this study. Human gut microbiota development is divided into three phases: a developmental phase (months 3-14), a transitional phase (months 15-30), and a stable phase (months 31-46) [29]. Because recent data show that most childhood CD cases will develop by age 5 years [56], we analyzed samples representing all of these critical phases. We report, for what we believe is the first time, that there are significant differences in microbiome composition and function at each developmental phase in CD progressors.
Consistent with previous reports [55], we found that the proportion of the phylum Firmicutes was higher in CD progressors at age 1 (Figure 1C). Bacterial proteases of species mostly classified within the Firmicutes phylum are involved in gluten metabolism, and this might be a link to CD [57]. Additionally, we identified bacterial species highly enriched in CD progressors at age 1, including Ruminococcus bromii, Bifidobacterium dentium, and Clostridium XIVa scindens. A previous study reported that the abundance of R. bromii was greatly reduced in CD patients when a gluten-free (GF) diet was introduced [58]. Consistent with our findings, other studies have shown that the abundance of B. dentium was increased in CD patients [59]. The Clostridium XIVa genus is responsible for producing the proinflammatory metabolite TDCA. As subjects aged, the differences in the gut microbiota decreased at the phylum level but significantly increased at the OTU level, which is more informative about massive alterations in the microbiota of CD progressors.
Intestinal IgA plays a crucial role in defending against pathogenic microorganisms and in maintaining gut microbiome homeostasis. Interestingly, IgA-deficient patients are more susceptible to a variety of pathologies, including CD [60]. Planer et al. described the progression of mucosal IgA responses during the first two postnatal years in healthy US twins [26]. They showed that (i) IgA-coated bacteria are affected by age and host genetics and (ii) the IgA response is determined by "intrinsic" properties of gut microbiota community members. We used a similar approach to investigate gut immune development towards healthy and CD states; we initially focused on the development of the IgA response during gut microbiota maturation. At age 1, we did not identify any OTUs that were significantly different between IgA- and IgA+ samples, suggesting that the intestinal IgA response is not mature enough to target specific bacteria in the gut. However, we identified one OTU in the pathogenic Brucella genus that was highly coated with IgA in CD progressors. Further studies using distinct cohorts will be needed to verify this result, but Brucella species are well-characterized pathogens [61] and might be related to increased inflammation and IgA responses. Flow cytometry analyses showed that the IgA response is highly selective and that only a small fraction of the gut microbiota is highly coated with IgA in the first five years of life. More importantly, the percentage of IgA+ bacteria was higher in CD progressors than in healthy controls at age 5. While a reduction of secretory IgA (sIgA) in infant (4-6 months) fecal samples from CD progressors [21] was reported previously, we did not observe such a defect in our cohort. In contrast, we identified a two-fold increase in the number of IgA-coated bacteria in CD progressors, especially at age 5. The increased IgA response in the gut might be related to increased inflammation caused by an increased number of inflammatory bacteria.
The abundance analysis (See Additional file 3: Table S3, Figure 2E) showed that CD progressors' IgA+ microbiota is enriched with Firmicutes while healthy controls' IgA+ microbiota is enriched with Actinobacteria. Consistent with this finding, previous studies reported higher proportions of Firmicutes and lower proportions of Actinobacteria in HLA high-risk infants [20,62]. Notably, most of the bacterial strains in the human gut microbiota that can metabolize gluten are classified within the phyla Firmicutes and Actinobacteria [57]. Bacterial metabolism of gluten might result in different gluten peptide fragments, potentially related to CD autoimmunity. This might be one of the reasons why the IgA response specifically targets Firmicutes in CD progressors at age 5. On the other hand, some members of the Actinobacteria, such as Bifidobacterium species, can stimulate the production of IgA in the intestines [63,64]. Bifidobacterium species are known to regulate the immune response by inducing dendritic cells and Treg cells with regulatory activities [65]. Treg cells play an important role in maintaining tolerance to food antigens and microbiota and in suppressing autoimmunity and inflammation in the gut [66,67]. Thus, enrichment of Actinobacteria in healthy IgA+ microbiota is potentially related to regulatory functions.
Our analysis revealed 144 OTUs at age 2.5 and 167 OTUs at age 5 that were highly IgA coated in the CD progressors. Among them, Coprococcus comes and Bacteroides finegoldii at age 2.5 and Faecalibacterium prausnitzii and Clostridium XIVa at age 5 were the main targets of the mucosal immune response in CD progressors. Among these bacteria, Coprococcus comes has recently been identified as the main IgA target in the human colon [68].
Notably, we identified 72 OTUs at age 2.5 and 45 OTUs at age 5 that were equally abundant in CD progressors and healthy controls but selectively targeted by IgA in CD progressors. Some of these OTUs are resolved at the species level. For example, Bacteroides finegoldii and Bacteroides vulgatus at age 2.5 and Peptostreptococcus stomatis at age 5 were selectively targeted by IgA in CD progressors but not in controls. B. vulgatus has been implicated in the development of gut inflammation, and a previous report identified this bacterium as enriched in older children with CD [69,70]. In agreement with our results, a pathogenic role for Bacteroides species has been related to the loss of integrity of the intestinal epithelial barrier [70].
PICRUSt analyses showed significant differences at all developmental phases within the transition period at age 1. The most significant differences were identified in pathways related to bacterial pathogenesis and shaping the composition of the microbiota. For example, glutathione metabolism was greatly decreased in CD progressors. A decreased glutathione redox cycle in CD patients is strongly associated with disease development [71]. At age 5, PICRUSt predicted retinol metabolism, steroid hormone biosynthesis, and glycosaminoglycan degradation as over-represented pathways in CD progressors. Retinoic acid is one of the products of retinol metabolism and plays a key role in the intestinal immune response [72]. A previous study showed that retinoic acid mediated inflammatory responses to gluten in CD patients [73]. The increased retinol metabolism and glycosaminoglycan degradation pathways in CD progressors are potentially related to chronic inflammation. Indeed, glycosaminoglycans help to form a protective barrier for the intestinal mucin. The breakdown of glycosaminoglycans is reported to be associated with the inflammatory response in intestinal disorders such as IBD [74]. These results suggest that during the developmental phase, the gut microbiota functions in CD progressors were related to shaping the gut microbiota composition. Entering the transition phase, the gut microbiota in CD progressors displayed more proinflammatory and oxidative stress-related features. In the stable phase, the gut microbiota in CD progressors begins to become more involved in functions related to the clinical manifestation of the disease. This longitudinal observation provides insight into the proinflammatory and pathogenic function of the gut microbiota in different stages of early CD pathogenesis.
Although the hallmark of CD is intestinal inflammation, the disease affects different tissues. To determine the systemic effects of gut microbiota on different organs, we performed a comparative plasma metabolomics analysis at age 5. In agreement with the gut microbiota analysis, plasma metabolites were significantly altered prior to diagnosis in CD progressors. The most significantly altered plasma metabolites were TDCA and isobutyryl-L-carnitine (Figure 4C), both of which were increased twofold in CD progressors. TDCA is a conjugated bile acid that has been shown to be proinflammatory [51] and is mainly produced by gut bacteria, particularly by Clostridium XIVa and Clostridium XI [52]. This observation suggests that the plasma TDCA detected in our study is secondary to the increased abundance of some Clostridium XIVa species in CD progressors at age 5 (Fig 4D). Likewise, isobutyryl-L-carnitine is a member of the acylcarnitines. As a byproduct of incomplete beta oxidation, increased isobutyryl-L-carnitine is related to abnormalities in fatty acid metabolism [75] and activates proinflammatory signaling [76].
Pathway analysis for plasma metabolites identified several pathways including the pentose phosphate pathway (PPP), lysine degradation, and glycerolipid metabolism. PPP, which was identified as the most significantly altered pathway, stimulates formation of NADPH as an antioxidant, thereby controlling cell inflammation. Consistently, the fatty acid oleic acid [77] and its amide derivative oleamide [78] decreased in CD progressors (Figure 4C). Oleic acid was reported to play an anti-inflammatory role by inhibiting proinflammatory lymphocyte proliferation [79,80]. As the major component of olive oil, oleic acid showed beneficial anti-inflammatory effects in another autoimmune disease, rheumatoid arthritis [81]. Further, its derivative oleamide was reported to suppress LPS-induced inflammation in vitro [82] and in vivo [78]. The decrease of oleic acid and oleamide in CD progressors is potentially linked to increased inflammation in CD progressors. Thus, plasma metabolites of CD progressors are a component of the inflammatory response.
Currently, the only way to treat CD is strict adherence to a gluten-free (GF) diet, but 20% of patients do not respond to the GF diet and continue to have persistent or recurrent symptoms [12]. CD permanently reshapes intestinal immunity, and alterations of TCRγδ+ intraepithelial lymphocytes in particular may underlie nonresponsiveness to the GF diet [83]. Our findings suggest that the inflammatory nature of the CD progressors' gut microbiota is a key component of intestinal inflammation in CD. The proinflammatory factors identified in this study potentially trigger local and systemic inflammation independent of the diet and may explain a failure to respond to the GF diet in some patients.
The main strength of this study lies in the longitudinal sampling that represents all three phases of gut microbiota development in children. Further, applying IgA-seq analysis, for what we believe is the first time, provides an important dimension to evaluate CD pathogenesis. The main limitation of this study is the small sample size, particularly at age 1 (n=5). Follow-up studies with larger sample sizes and deep shotgun sequencing will further increase the statistical power and potentially strengthen our findings.

Conclusions
Taken together, our findings suggest that the gut microbiota of CD progressors in the first five years of life has profound effects on the inflammatory response and can potentially contribute to the onset and progression of CD. Moreover, we established a link between gut microbiota composition and chronic inflammation in CD during child development. The highly IgA-coated bacteria identified in this study potentially contribute to CD pathogenesis. Targeting these bacteria in the early stages of CD development could be a preventative tool. Likewise, developing anti-inflammatory probiotics/prebiotics might provide viable therapeutics for altering microbiota composition in children genetically predisposed to CD. These microbes/compounds may also complement a gluten-free diet in patients who continue to experience persistent CD symptoms. Lastly, the early plasma markers, including TDCA, oleic acid, and oleamide, have the potential to serve as useful biomarkers for pre-CD diagnosis. Understanding the role of the gut microbiota in CD onset may open novel avenues to understand disease pathogenesis and reveal new preventive and treatment models.

In addition to a video film presentation, oral and written informed consent was obtained from the parents of the children included in the study.

Consent for publication: Not applicable
Availability of data and materials: All OTU-related data analyzed in this study are included in this published article (Additional file 2: Table S2). The 16S rRNA gene sequencing raw data generated in this study are available through the NCBI Sequence Read Archive under BioProject PRJNA631001. The plasma metabolomics data are included in this published article (see Additional file 14: Table S8). The gut microbiome analysis code generated in this study is available at https://github.com/jdreyf/celiac-gut-microbiome.
Competing interests: J.F.L. coordinates a study on behalf of the Swedish IBD quality register (SWIBREG), and this study has received funding from Janssen Corporation.