Gut microbiome alterations predict diabetic kidney disease in general population

We collected fecal samples from 180 patients with DKD and 179 healthy controls (Con) and characterized their microbial profiles using 16S rRNA gene sequencing. Microbial communities of the two groups were compared with the Wilcoxon test. Two microbial models were identified by cross-validation performed on random forest analysis. The diagnostic value of the two models was assessed in a training set of 118 DKD patients and 132 healthy controls and validated in a testing set of 62 DKD patients and 47 healthy controls. PICRUSt was used to compare the potential metabolic function of the microbiota between the two groups. Our study suggested that gut

reduction of bacteria producing short-chain fatty acids (SCFA), such as Blautia and Faecalibacterium, was also observed in DKD patients. The optimal prediction model, consisting of 5 OTUs, could distinguish DKD from healthy controls with an area under the ROC curve of 83.71% in the training set and 80.89% in the testing set, while another set of 22 OTUs did not significantly improve diagnostic efficiency. Predicted metabolic functions associated with lipids and aromatic amino acids were enriched in DKD vs. Con.

Conclusion
Gut microbiome was altered in DKD patients. Our study suggested that the gut microbiome may provide a potential tool for DKD diagnosis, especially for patients who cannot accept or afford renal biopsy.

Background
Diabetic kidney disease (DKD), which occurs in 30%-40% of the diabetic population, has become a worldwide public health problem as the leading cause of end-stage renal disease (ESRD) [1]. As reported in 2015, DKD has surpassed glomerulonephritis to become the leading cause of chronic kidney disease (CKD) in the Chinese hospitalized population (1.10% versus 0.75%) [2]. However, it is inaccurate to identify DKD only by the appearance of micro-albuminuria or a decreased glomerular filtration rate (GFR) in diabetic patients. It is reported that DKD patients account for only approximately 25% of patients undergoing renal biopsy. Additionally, laboratory indices lack effectiveness for early DKD diagnosis. Thus, a novel test for early DKD identification is required.
Known as an "endogenous microbial organ", the commensal bacteria that colonize the gut ferment indigestible carbohydrates, repair the impaired intestinal barrier, competitively inhibit the growth of "harmful" bacteria, participate in immune maintenance and obtain nutrients from the host, driving host-microbe mutualism [3,4].
Disruption of this stable ecosystem, characterized by increased potential pathogens or decreased "beneficial" bacteria, probably implies a disease-induced imbalance. Accumulating research has suggested that the potential function of the gut microbiome is closely related to fatty acid metabolism, oxidative stress resistance and regulation of insulin sensitivity in diabetes, as well as to arteriosclerosis and thrombosis in atherosclerotic diseases [5][6][7][8]. Moreover, the use of microbial markers for disease identification has been proposed and proven feasible. A diagnostic model consisting of the genera Escherichia-Shigella and Prevotella_9 can distinguish 14 patients with diabetic nephropathy (DN) from 14 patients with diabetes mellitus (DM) with 92% accuracy [9]. Although the crucial role of the gut microbiome in disease development has been established within and beyond nephrology, the microbial composition and function in DKD patients have rarely been reported. Therefore, a large cohort of 180 DKD patients and 179 healthy controls was enrolled in our study to characterize the gut microbiome, predict microbiome-associated function and develop a potential DKD classifier based on the gut microbiome.
Identification of structural and metabolic alterations of the gut microbiome may help advance our understanding of the gut-kidney axis.

Patient selection
We clustered quality-filtered sequences into unique sequences and ranked the representative sequences by abundance in descending order using UPARSE. Sequences with 97% similarity to a representative sequence were clustered and assigned to the same operational taxonomic unit (OTU). We finally obtained all OTUs of each sample from 180 patients with DKD and 179 healthy controls. The taxonomy of each OTU was assigned against the Silva (SSU123) 16S rRNA database using the RDP Classifier (http://rdp.cme.msu.edu/) with a confidence threshold of 70%. Processing of raw sequences is described in Supplementary file 1.
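The abundance-ordered 97% clustering described above can be illustrated with a minimal sketch. It is not the UPARSE algorithm itself, which also performs alignment and chimera filtering; it assumes reads are already aligned to equal length, and the function names `identity` and `greedy_otu_clustering` are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clustering(unique_seqs, threshold=0.97):
    """Greedy centroid clustering, most abundant unique sequence first.

    unique_seqs maps each dereplicated sequence to its read count.
    Returns a list of (centroid_sequence, total_reads) tuples.
    """
    # Sort by abundance (descending); ties are broken by sequence for determinism.
    ordered = sorted(unique_seqs.items(), key=lambda kv: (-kv[1], kv[0]))
    otus = []  # each entry is [centroid, total_reads]
    for seq, count in ordered:
        for otu in otus:
            if identity(seq, otu[0]) >= threshold:
                otu[1] += count  # join the first centroid that is similar enough
                break
        else:
            otus.append([seq, count])  # no match: this sequence seeds a new OTU
    return [(centroid, reads) for centroid, reads in otus]
```

Processing sequences in descending abundance means the most frequent variant becomes the OTU representative, which mirrors the ordering step mentioned in the text.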
Sequencing depth per sample was assessed by rarefaction curves. Adequacy of the sample size of the DKD and Con groups was evaluated by species accumulation curves. The Ace/Chao indices and the Shannon/Simpson indices were used to estimate OTU richness and bacterial diversity, respectively. Both principal coordinate analysis (PCoA) and non-metric multidimensional scaling (NMDS) plots were generated to visualize the (un)weighted UniFrac dissimilarity, the statistical significance of which was tested using permutational multivariate analysis of variance (PERMANOVA) and analysis of similarities (ANOSIM) (QIIME package). Linear discriminant analysis (LDA) effect size (LEfSe) was applied to identify differential microbial taxa and pathways (LDA score > 2.0 and P value for Wilcoxon test < 0.05). Microbial metabolic pathways in both groups were predicted using PICRUSt against the Greengenes database.
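As an illustration of the richness and diversity indices named above, the following sketch computes Shannon, Simpson and Chao1 values from a vector of per-OTU read counts. It assumes the natural-log Shannon index, the dominance form of the Simpson index (sum of squared proportions) and the bias-corrected Chao1 estimator; the study's software may use slightly different variants, and the Ace index is omitted for brevity.

```python
import math


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson dominance D = sum(p^2); larger values mean lower diversity."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
```

Note that under this form of the Simpson index a lower value indicates a more even community, so its direction is opposite to that of the Shannon index.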

Statistical analysis
Bacterial taxa from the phylum to the genus level were compared between the DKD and Con groups using the non-parametric Mann-Whitney U test. To identify microbial markers with predictive potential, the Wilcoxon test was used to calculate the significance of OTUs (P <= 0.05 and Q <= 0.05) whose total abundance was larger than 0.001. After this preliminary screening, importance values were assigned to all potential OTU biomarkers by random forest. As a result, 382 OTUs were identified as key markers for cross-validation. With five trials of five-fold cross-validation, an input feature set consisting of 5 OTUs was regarded as the optimal set, with the minimum cross-validation error rate. Finally, the abundance profiles of these 5 OTUs in the training set and testing set (as mentioned above) were used to calculate the POD index for each sample. The area under the receiver operating characteristic (ROC) curve (AUC) was used to assess diagnostic performance. Similarly, a microbial model consisting of 22 OTUs was identified as an alternative set when the abundance threshold was set to 0.005. All statistical analyses mentioned above were performed in R.
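The screening step above filters OTUs on both P and Q values. The text does not state how Q was computed; the sketch below assumes Q is the Benjamini-Hochberg false discovery rate adjustment of the per-OTU Wilcoxon P values. The function names and the `select_markers` helper are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


def select_markers(otu_ids, p_values, alpha=0.05):
    """Keep OTUs with P <= alpha and Q <= alpha (assumed screening rule)."""
    q_values = benjamini_hochberg(p_values)
    return [otu for otu, p, q in zip(otu_ids, p_values, q_values)
            if p <= alpha and q <= alpha]
```

Because a q-value is never smaller than its p-value, the Q <= 0.05 condition is the binding one; the P condition is kept only to mirror the rule as written.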

A loss of microbial diversity in DKD
Following rigorous clinical inclusion and exclusion criteria, a total of 359 fecal samples were obtained for our study, from 179 healthy controls and 180 DKD patients. As shown in Figure S1A and S1B, an average of 28349 reads per sample was obtained in the DKD group, lower than the 29348 of the Con group. With sufficient sample size, the number of detected OTUs showed a marked decrease in DKD compared with healthy controls (Figure S1C). For instance, there were 2331 OTUs in total in the DKD group and 2997 in the Con group (Figure 1A). The rank-abundance curve indicated a relatively uniform and diverse OTU profile in the Con group (Figure S1D). By calculating alpha diversity indices, we observed a decrease of OTU richness (Ace for DKD vs. Con: 468.60 vs. 545.21; P=1.34e-05; Chao for DKD vs. Con: 542.97 vs. 453.49; P=1.08e-06) and diversity (Shannon for DKD vs. Con: 3.65 vs. 3.62; P=5.19e-06; Simpson for DKD vs. Con: 0.09 vs. 0.14; P=0.0048) in DKD compared with Con (Figure 1B-E). To visualize the spatial distribution of the gut microbiome among all samples, the (un)weighted UniFrac algorithm was used to compute beta diversity. PCoA revealed a significant difference between the microbial communities of DKD patients and healthy controls for unweighted UniFrac distance (PERMANOVA: R2=0.0074, P=0.001; Figure 2F), but not for weighted UniFrac distance (PERMANOVA: R2=0.0036, P=0.197; Figure S2A). Significant differences were also found in NMDS analysis based on Bray-Curtis distance (ANOSIM test: R=0.01342, P=0.004; Figure S2B) and unweighted UniFrac dissimilarity (ANOSIM test: R=0.02529, P=0.001; Figure S2C).
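To make the PERMANOVA statistics reported above easier to interpret, the sketch below computes the pseudo-F statistic, R2 (the fraction of total variation attributed to grouping) and a permutation P value from a precomputed distance matrix. It is a minimal illustration under the standard one-factor formulation, not the QIIME implementation; the function name `permanova` and its parameters are hypothetical.

```python
import random


def permanova(dist, labels, n_perm=999, seed=0):
    """One-factor PERMANOVA on a square distance matrix (list of lists).

    Returns (pseudo_F, R2, p_value).
    """
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)

    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n

    def stats(lab):
        ss_within = 0.0
        for g in groups:
            idx = [i for i in range(n) if lab[i] == g]
            pair_sum = sum(dist[i][j] ** 2
                           for k, i in enumerate(idx) for j in idx[k + 1:])
            ss_within += pair_sum / len(idx)
        ss_between = ss_total - ss_within
        pseudo_f = (ss_between / (a - 1)) / (ss_within / (n - a))
        return pseudo_f, ss_between / ss_total

    f_obs, r2 = stats(labels)

    # Permutation test: shuffle group labels and recompute pseudo-F.
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stats(shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)
```

With several hundred samples, a very small R2 can still yield a small P value, which is consistent with the pattern in the unweighted UniFrac result above.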

Dysbacteriosis in gut of DKD
At the phylum level, the four dominant phyla (i.e., Firmicutes, Bacteroidetes, Proteobacteria and Actinobacteria) accounted for 95% of all sequences (Figure S3A). Compared with the Con group, a relative decrease of the phylum Bacteroidetes and a relative increase of Proteobacteria were observed in DKD, accompanied by an increased Firmicutes/Bacteroidetes ratio (Figure 2A). At the class level, Gammaproteobacteria, Verrucomicrobiae and Erysipelotrichia were more abundant in DKD, while Betaproteobacteria and Bacteroidia were relatively decreased (Figure 2B). Correspondingly, a total of 16 taxa at the order level and 31 taxa at the family level were identified as significantly different between the two groups (Figure 2C, 2D). In more detail, at the genus level we observed a relative increase of potential pathogens, namely Peptostreptococcus, Streptococcus, Peptostreptococcaceae_incertae_sedis, Erysipelotrichaceae_incertae_sedis, Enterococcus and Escherichia-Shigella, enrichment of the sulphate-reducing bacterium Desulfovibrio, and a decrease of "beneficial bacteria" including Roseburia, Blautia, Bacteroides and Faecalibacterium in DKD patients. Notably, the relative abundance of Akkermansia, a mucin-degrading bacterium, was also increased in DKD (Figure 2E, Figure S3B).
To evaluate microbial differences at the taxonomic level, LEfSe analysis was performed on the microbial content of the 180 DKD patients and 179 healthy controls (LDA score >2.0, P value <0.05). Besides the taxa mentioned above, decreased abundance of the genera Lachnospira, Pseudobutyrivibrio and Anaerostipes and increased abundance of Bifidobacterium and Lactobacillus were observed in DKD patients. We also demonstrated that the genera Bacteroides (LDA=4.90, P=1.95e-19) and Escherichia_Shigella (LDA=4.65, P=7.92e-25) were strongly correlated with disease status. All the results above showed that the gut microbiome in DKD deviated markedly from that in healthy status (Figure S4).
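The phylum-level comparison and the Firmicutes/Bacteroidetes ratio mentioned above can be derived from an OTU table and a taxonomy mapping, as in the following sketch. The input structures (`otu_counts`, `otu_phylum`) and function names are assumptions made for illustration and are not taken from the study's pipeline.

```python
from collections import defaultdict


def phylum_relative_abundance(otu_counts, otu_phylum):
    """Collapse per-OTU read counts of one sample to phylum-level proportions.

    otu_counts: dict of OTU id -> read count in the sample
    otu_phylum: dict of OTU id -> phylum name
    """
    totals = defaultdict(float)
    for otu, count in otu_counts.items():
        totals[otu_phylum[otu]] += count
    n_reads = sum(totals.values())
    return {phylum: reads / n_reads for phylum, reads in totals.items()}


def firmicutes_bacteroidetes_ratio(rel_abundance):
    """Ratio of Firmicutes to Bacteroidetes relative abundance in one sample."""
    return rel_abundance["Firmicutes"] / rel_abundance["Bacteroidetes"]
```

Per-sample ratios obtained this way could then be compared between groups with the same non-parametric test used for the taxa themselves.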

DKD identification based on OTUs
Considering the striking microbial differences between DKD patients and healthy controls, we proposed a strategy based on potential microbiota-targeted markers for distinguishing DKD from healthy status. As mentioned in the 'Statistical analysis' section, we performed random forest analysis on more than 1000 OTUs (number of sequences > 0.0001) selected by the Wilcoxon test (P<0.05 and Q<0.05). As a result, 328 OTUs were identified as key markers, and the distribution of the first 73 OTUs (importance value >0.001) in the 180 DKD patients and 179 healthy controls is shown in a heatmap. Differential relative abundance between DKD and Con was observed (Figure 3). We also delineated the profile of the 50 most abundant OTUs in each sample and found a relatively insignificant difference between the DKD and Con groups (Figure S5).
With five-fold cross-validation performed on the random forest model, a combination of 5 OTUs (model 1) was identified as having the highest diagnostic value. As shown in Figure 4A, this optimal set had the minimum cross-validation error rate. The contribution and stability of these OTUs in constructing the disease classifier were evaluated by mean decrease in accuracy and mean decrease in Gini, respectively (Figure 4B-C). The same identification method was further applied to the OTUs whose abundance was larger than 0.0005, and 22 OTUs were identified as another microbial combination with diagnostic value (model 2; Figure 4D). Their role in model construction is shown in Figure 4E-F. We applied ROC analysis to the training set of 118 DKD patients and 132 healthy controls and found that the combination of 5 OTUs could separate DKD patients from healthy controls with an AUC of 83.71%, compared with 81.56% for model 2 (Figure 5A). Although average POD values in the DKD group were higher than in the Con group for both OTU sets, the discrepancy was more significant in model 1 (Figure 5B, 5C). The remaining 62 DKD patients and 47 healthy controls were assigned to the testing set to assess the diagnostic efficiency of the two OTU sets. As expected, the area under the ROC curve (AUC) reached 80.89% in model 1 and 76.75% in model 2. Higher POD values in the DKD group than in the Con group were also observed (Figure 5D, 5E). These data suggested that increasing the number of OTU markers did not improve predictive performance.
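For readers who want to reproduce the AUC calculation from per-sample scores, the sketch below computes the area under the ROC curve as the probability that a randomly chosen DKD sample receives a higher score than a randomly chosen control, counting ties as one half. It assumes the POD index is a per-sample score in which higher values indicate DKD (for example, the fraction of random forest trees voting for DKD); that interpretation and the function name `roc_auc` are assumptions, not details confirmed by the text.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of case and control scores.

    scores: per-sample scores such as POD values (higher = more likely DKD)
    labels: 1 for DKD, 0 for healthy control
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0   # case ranked above control
            elif c == h:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))
```

This pairwise form is equivalent to the Mann-Whitney U statistic divided by the number of case-control pairs, so it needs no explicit threshold sweep.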

Microbiome-associated functional changes in DKD
PICRUSt was used to analyze microbiome-associated metabolism in disease and healthy status. A total of 145 KEGG pathways showed distinct relative abundance between the DKD and Con groups (LDA score >2.0 and P value <0.05). Functional categories including membrane transport (ABC transporters and phosphotransferase system), cell motility (bacterial motility proteins), lipid metabolism (fatty acid, glycerolipid and glycerophospholipid metabolism and unsaturated fatty acid biosynthesis), amino acid metabolism (metabolism of aromatic amino acids, glutathione and others) and carbohydrate metabolism (pyruvate metabolism) showed higher levels in DKD patients, while metabolism of cofactors, vitamins and specific amino acids (i.e., alanine, aspartate, glutamate, arginine and proline) and oxidative phosphorylation were decreased in DKD patients compared with healthy controls (Figure 6).

Discussion
In this study, we collected fecal samples from 180 DKD patients and 179 healthy individuals and used high-throughput 16S rRNA MiSeq sequencing to map the gut microbial profile. First, we described the obvious differences in diversity and abundance of gut microbes between DKD patients and healthy individuals. Compared with healthy individuals, the abundance and diversity of the intestinal microbiota of DKD patients were relatively lower. Similar results were found in other kidney diseases such as chronic kidney disease and IgA nephropathy [10,11]. At the phylum level, the proportions of Proteobacteria and Verrucomicrobia were significantly increased, while that of Bacteroidetes was decreased; at the genus level, the proportions of opportunistic pathogenic bacteria such as Escherichia-Shigella, Klebsiella, Enterococcus, Erysipelotrichaceae_incertae_sedis, Peptostreptococcaceae_incertae_sedis and Peptostreptococcaceae_incertae were relatively increased. Second, by detecting changes in the abundance of g_Lactobacillus, g_Sphingomonas, g_Gardnerella, g_Erysipelotrichaceae_incertae_sedis and g_Lachnospiraceae_unclassified, diabetic kidney disease could be accurately differentiated from the healthy control group, which was matched for gender and age (AUC = 0.8089). In this study, compared with normal individuals, the genera Bacteroides and Faecalibacterium were significantly decreased in DKD patients, whereas Escherichia-Shigella was significantly increased. Intriguingly, in Tao's study [12] the genus Bacteroides was higher in DKD, whereas in our study it was significantly lower in DKD than in the HC group.
The metabolites of the gut microbiota play pivotal roles in pathogenesis and health maintenance [13][14][15], especially SCFAs, ammonia, aromatic amino acids, butyric acid, indole and p-cresol [16,17]. The single layer of epithelial cells that makes up the mucosal interface between the host and microorganisms allows microbial metabolites to gain access to and interact with host cells, and thus influence immune responses and disease risk [15]. Compared with the HC group, metabolism of amino acids, particularly glutamine, was significantly decreased in the DKD group. One study reported lower amino acid levels in the intestine, which may be because a large amount of nitrogen is needed to maintain protein synthesis during gut microbiota proliferation or intestinal tissue growth [16,18]. It is reported that intestinal microbes have a critical role in the normal development of organized lymphoid structures and in the regulation of host immune function [15][19][20][21]. It has recently become evident that individual commensal species influence the makeup of lamina propria T lymphocyte subsets that have distinct effector functions [15].
The study by Jiang S et al. [22] reported that Coprococcus and Faecalibacterium were significantly lower in CKD than in the normal group. Similarly, in our study they were significantly lower in the DKD group than in the HC group. Coprococcus and Faecalibacterium belong to Firmicutes-Clostridiales and can produce butyrate. Butyrate regulates T cells by up-regulating H3 and H4 histone acetylation at the Foxp3 locus, which in turn affects systemic inflammation and diabetic kidney disease. Butyrate production is influenced by diet [23], but the study by Tao S et al. [12] showed that, under an identical diet, the gut microbiota composition of the Household (HH) and Healthy Control groups was similar but significantly different from that of the DN group, which indicates that disease plays an important role in promoting alterations of gut microbiota composition. However, the study by Poesen R et al. [24] found that the gut microbiota composition of the Household group and the CKD group, which had the same diet, gender, age and BMI, was similar. Our study only included DKD patients without other comorbid diseases, while in real clinical practice most DKD patients have multiple chronic diseases (e.g. cardiovascular diseases, tumors, Alzheimer's disease, CKD etc.), which may also influence the abundance and diversity of the gut microbiota [7][25][26][27][28]. The correlation between gut microbiota composition and lifestyle or specific medications also needs to be further clarified. Therefore, further studies to determine characteristic intestinal microbial markers for DKD diagnosis are required.

Conclusion
This study found structural and metabolic alterations in the gut microbiota of DKD patients. Based on these findings, the unique gut microbiota composition may provide a potential non-invasive diagnostic method for DKD. It may also offer an option for early screening and prediction of DKD among healthy individuals in the community.

and 81700633). The funders designed the study and approved publication of manuscript.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval and consent to participate
The present study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of the First Affiliated Hospital of Zhengzhou University, School of Medicine (2019-KY-361). All patients provided written informed consent.

Consent for publication
Written informed consent from the reported patient for publication was obtained.

Competing interests
The authors declare that they have no competing interests.

Footnotes
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.