Chemicals and reagents. Three diets were used: a standard chow diet (1010001, Jiangsu Xietong Pharmaceutical Bio-engineering Co., Ltd.), a control diet containing 10 kcal% fat (D12450J, Research Diets Inc., New Brunswick, NJ, USA), and a high-fat diet (HFD) containing 60 kcal% fat (D12492, Research Diets Inc., New Brunswick, NJ, USA). DiCQAs were prepared according to our previously reported method [16]. The primary antibodies against FXR (#72105), p-AKT (Ser473, #4060), AKT (pan, #4685), p-PI3K (Tyr458, #4228), PI3K (p85, #4257), p-GSK3β (Ser9, #5558), GSK3β (#12456), p-AMPK (Thr172, #2535), AMPKα (#5831) and IRS-1 (#2382) were purchased from Cell Signaling Technology (Beverly, MA). The primary antibodies against FGF15 (ab229630), FGFR4 (ab178396), TGR5 (ab72608), MUC2 (ab272692), Occludin (ab216327), iNOS (ab178945), SREBP2 (ab30682), TNF-α (ab183218), IL-1β (ab254360) and Claudin (ab211737) were purchased from Abcam (Cambridge, MA, USA). The primary antibodies against FABP6 (ER62027), SREBP1 (ER1917-19), NF-κB (p65, ET1603-12), p-ERK (Erk1 (T202) + Erk2 (T185), ET1603-22), ERK1/2 (ET1601-29), p-JNK1/2/3 (T183 + T183 + T221, ET1609-42), JNK1/2/3 (ET1601-28), p-P38 (Thr180 + Tyr182) and P38 (ET1702-65) were purchased from HUBIO (Wuhan, China). The primary antibodies against α-Tubulin (AF2827), β-actin (AA128), GLP-1R (AF6996), PPARγ (AF7797) and PEPCK (AF7689) were purchased from Beyotime Institute of Biotechnology (Shanghai, China).
Animal study. (1) The animal experiments strictly adhered to the Laboratory Animal Standards of Welfare and Ethics (DB32/T-2911-2016) and were approved by the Institutional Animal Care and Use Committee at Nanjing Agricultural University (SYXK-2017-0027). Specific-pathogen-free (SPF) male mice, aged six weeks, were sourced from Hangzhou Ziyuan Laboratory Animal Technology Co., Ltd. (SCXK(Zhejiang)2019-0004, Hangzhou, China). All mice were housed in a controlled SPF environment with a 12-hour light-dark cycle, a temperature of 20–22°C, and a relative humidity of 45 ± 5%. After a one-week acclimatization period, the mice were assigned to different diets: the control group was fed the control diet (D12450J), while the remaining three groups were fed the HFD (D12492). After six weeks on their respective diets, all mice except those in the control group received intraperitoneal injections of streptozotocin (STZ) at a dose of 50 mg/kg, freshly prepared in pre-frozen citrate buffer (pH 4.5), three times on alternate days within one week. The control group received an equivalent volume of citrate buffer. On the third and seventh days after STZ injection, fasting blood glucose (FBG) levels were measured using a Roche Accu-Chek Performa blood glucose meter (Roche Diagnostics (Shanghai) Co., Ltd., Shanghai, China). Mice with FBG levels reaching 16.7 mmol/L (based on two measurements) were identified as diabetic and selected for the subsequent experiment. Diabetic mice were randomly assigned to three groups: the diabetes mellitus (DM) group, the low-dose diCQAs (DL) group, which received 200 mg/kg, and the high-dose diCQAs (DH) group, which received 400 mg/kg. Each group consisted of eight mice, housed in pairs within separate cages.
The treatment was administered once daily by gavage for six weeks. Body weight, food intake, and water consumption were recorded weekly, and FBG levels were monitored biweekly. An oral glucose tolerance test (OGTT) was performed seven days before the end of the study [17]. At the end of the experiment, following a 12-hour fast, the mice were humanely euthanized using CO2. Plasma, liver, colon, muscle, and ileum tissue and fecal samples were then collected, kept in liquid nitrogen, and stored at -80°C until analysis.
(2) Mice in the long-term study were fed the standard chow diet. Each group consisted of eight mice, housed in pairs within separate cages. After a one-week acclimatization period, diCQAs were administered by gavage to the LND group at a dosage of 200 mg/kg, consistent with the dosage given to the DL group. The control group (LNC) received an equivalent volume of distilled water. OGTT and blood glucose assessments were conducted as described above over a period of 12 weeks. At the end of this experiment, the mice were fasted overnight and then euthanized with CO2. Plasma, liver, colon, muscle, and ileum tissue and fecal samples were then collected, kept in liquid nitrogen, and stored at -80°C until analysis.
AGP cohort and MAGs study. The American Gut Project (www.americangut.org) undertook sample processing, sequencing, and core amplicon data analysis. All amplicon sequence data and associated metadata are publicly available through the AGP data portal (qiita.ucsd.edu; study ID 10317) [18]. Briefly, data filtering was performed in the QIIME2 pipeline [19]: sequences with a total read count below 5000 and an occurrence rate across samples below 10% were removed. In accordance with established empirical findings, bloom sequences were also removed [20] by excluding the corresponding hash values produced during QIIME2 processing. The original GTDB 202 database, trained with RESCRIPt (v2023.2.0), was then used for taxonomy assignment [21]. Within the AGP metadata, individuals with a BMI of 18 to 25 were categorized as the "normal" group (n = 6016), those with a BMI greater than 25 and up to 30 as the "overweight" group (n = 2516), and those with a BMI above 30 as the "obese" group (n = 884). In addition, a subset of 186 individuals diagnosed with diabetes mellitus was designated as the "T2DM" group.
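The BMI grouping described above can be illustrated with a minimal Python sketch. The function name bmi_group is hypothetical and is not part of the AGP or QIIME2 tooling; returning None for BMI values below 18 is an assumption, since the text does not assign those individuals to a group. The T2DM group is defined by diagnosis in the metadata rather than by BMI, so it is not handled here.

```python
from typing import Optional


def bmi_group(bmi: float) -> Optional[str]:
    """Map a BMI value to the group labels used for the AGP cohort.

    Cutoffs follow the text: 18-25 normal, >25-30 overweight, >30 obese.
    Values below 18 are not assigned to any group (assumption: excluded).
    """
    if 18 <= bmi <= 25:
        return "normal"
    if 25 < bmi <= 30:
        return "overweight"
    if bmi > 30:
        return "obese"
    return None


if __name__ == "__main__":
    for value in (17.5, 22.0, 25.0, 27.3, 30.0, 34.1):
        print(value, bmi_group(value))
```

Boundary values are resolved as stated in the text: a BMI of exactly 25 falls in the "normal" group and a BMI of exactly 30 in the "overweight" group.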
For metagenome-assembled genome (MAG) analysis, sequences were obtained from the NCBI or ENA database. Trimmomatic (v0.33) was used to remove contaminated sequences, including those originating from the sequenced organism itself or from adapters. MEGAHIT (v1.1.1) was then used for sample-by-sample assembly, with a minimum contig length of 500 and 20 processing threads. Sequences shorter than 1.5 kilobases were excluded from further consideration. To maximize the recovery of bins, DAS Tool (v1.1.6) was used to bin each sample with the parameters "-l concoct,maxbin,metabat --threads 20 --score_threshold 0.8". MAGs were refined with dRep (v3.4.2) to reduce redundancy within each bin [22], using an average nucleotide identity (ANI) cutoff of 0.95, so that the resulting high-quality MAGs represented individual metagenomic species as far as possible. The dRep parameters were "-pa 0.9 -sa 0.95 -cm larger -con 10". For near-complete MAGs, the -nc parameter in dRep was set to 0.60; for medium-quality MAGs with a quality score (QS) greater than 50, it was set to 0.30 [23]. A total of 1081 bins underwent redundancy removal with dRep, which retained 615 non-redundant bins, and CheckM (v1.1.2) [24] was used to determine the quality of each bin. Supplementary Table S5 reports the key quality statistics, including contig N50, contig count, average contig length, open reading frame (ORF) count, tRNA and rRNA gene counts, and contig read depth, following the methodology described by Xie et al. [25]. Relative bin abundances in each sample were calculated using CoverM with the parameters "-m relative_abundance --min-read-aligned-percent 0.75 --min-read-percent-identity 0.95 --min-covered-fraction 0".
This approach, which applies stringent identity criteria and truncates alignments, mitigates result inflation due to spurious mappings. For phylogenetic analysis, we used PhyloPhlAn (v3.0.3) [26] to construct a phylogenetic tree from the 615 high-quality bins. The universal prokaryotic markers used for this tree were sourced from Nicola et al.'s work. Protein sequences within the MAGs were aligned using MUSCLE (v3.8.31), and the tree was inferred using FastTree (v2.1.9). iTOL was used to visualize the phylogenetic trees. Taxonomic annotation of all MAGs was performed with GTDB-Tk (v2.1.0) using the database (v202). For functional annotation, all protein sequences predicted by Prodigal (v2.6.3) were annotated using the default settings of KofamScan (v1.1.0) and EggNOG-mapper (v2.1.10). The KO numbers relevant to bile acid (BA) metabolism are: bile salt hydrolase (BSH, ko.K01442); the hydroxysteroid dehydrogenases (HSDHs) 7α-HSDH (ko.K00076), 7β-HSDH (ko.K23231), 3α-HSDH (ko.K22604), and 3β-HSDH (ko.K22607); 3α-hydroxy BA-CoA-ester 3-dehydrogenase (baiA, ko.K15869); BA-coenzyme A ligase (baiB, ko.K15868); 3-oxocholoyl-CoA 4-desaturase (baiCD, ko.K15870); BA 7α-dehydratase (baiE, ko.K15872); BA CoA-transferase (baiF, ko.K15871); 3-dehydro-bile acid Δ4,6-reductase (baiN, ko.K07007); 7β-hydroxy-3-oxochol-24-oyl-CoA 4-desaturase (baiH, ko.K15873); and BA 7β-dehydratase (baiI, ko.K15874). The carbohydrate-active enzyme (CAZyme, http://www.cazy.org/) profile of each MAG was predicted against the CAZyme database [38] using HMMER (a Hidden Markov Model (HMM) approach), DIAMOND, and dbCAN_sub; only results obtained from all three tools were considered. We applied CD-HIT (v4.8.1) to generate a sequence similarity network (SSN) for a total of 514 BSH genes within the MAGs [27].
The BSH homologs were processed using CD-HIT with a 40% identity threshold [28]. BWA-MEM2 (v2.2.1) [29] was used to determine gene abundance in the MAGs. Briefly, an index was built for each bin, and all sequencing data were aligned against it. Subsequently, samtools (v1.18) [30] and 'jgi_summarize_bam_contig_depths' were used to obtain the mapped reads. Finally, R (v4.3.0) was used to calculate transcripts per million (TPM).
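As an illustration of the TPM normalization mentioned above, the following is a minimal Python sketch of the standard calculation (the authors performed this step in R; Python is used here only for illustration). The function name tpm and its inputs, per-gene mapped read counts and gene lengths in base pairs, are assumptions for the example and not the authors' script.

```python
from typing import List, Sequence


def tpm(read_counts: Sequence[float], gene_lengths_bp: Sequence[float]) -> List[float]:
    """Convert per-gene mapped read counts to transcripts per million (TPM).

    Step 1: divide each count by the gene length in kilobases (reads per kilobase).
    Step 2: scale the per-kilobase values so that they sum to one million.
    """
    if len(read_counts) != len(gene_lengths_bp):
        raise ValueError("read_counts and gene_lengths_bp must have the same length")
    if any(length <= 0 for length in gene_lengths_bp):
        raise ValueError("gene lengths must be positive")

    reads_per_kb = [count / (length / 1000.0)
                    for count, length in zip(read_counts, gene_lengths_bp)]
    total = sum(reads_per_kb)
    if total == 0:
        # No mapped reads at all: every gene gets a TPM of zero.
        return [0.0] * len(reads_per_kb)
    return [value / total * 1e6 for value in reads_per_kb]


if __name__ == "__main__":
    # Three hypothetical genes: counts 100, 200, 300 with lengths 1 kb, 2 kb, 1 kb.
    print(tpm([100, 200, 300], [1000, 2000, 1000]))
```

Because the length correction is applied before scaling, TPM values sum to one million within each sample, which makes gene proportions comparable across samples of different sequencing depth.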
Plasma sample analyses. Fasting plasma glucose (FPG), total cholesterol (TC), triglyceride (TG), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), blood urea nitrogen (BUN), alanine aminotransferase (ALT), and aspartate aminotransferase (AST) levels in plasma were assessed using commercially available kits obtained from Nanjing Jiancheng Bioengineering Institute (Nanjing, China). Plasma levels of the cytokines tumor necrosis factor-alpha (TNF-α), interleukin-1β (IL-1β), IL-6, and IL-10 were quantified using enzyme-linked immunosorbent assay (ELISA) kits from NeoBioscience Technology Company (Shenzhen, China). Insulin (INS) levels in plasma were determined using an ELISA kit from Mercodia AB (Uppsala, Sweden), while glucagon-like peptide 1 (GLP-1) levels in plasma were measured with an ELISA kit from Cusabio (Wuhan, China).
Bile acids and cholesterol analyses. Depending on sample type, the analytes quantified were cholesterol and the following bile acid (BA) species: cholic acid (CA), α-muricholic acid (αMCA), β-muricholic acid (βMCA), ω-muricholic acid (ωMCA), chenodeoxycholic acid (CDCA), deoxycholic acid (DCA), hyodeoxycholic acid (HDCA), ursodeoxycholic acid (UDCA), lithocholic acid (LCA), taurocholic acid (TCA), tauro-α-muricholic acid (T.α.MCA), tauro-β-muricholic acid (T.β.MCA), taurodeoxycholic acid (TDCA), taurochenodeoxycholic acid (TCDCA), tauroursodeoxycholic acid (TUDCA), glycochenodeoxycholic acid (GCDCA), glycodeoxycholic acid (GDCA), and glycocholic acid (GCA). The standards T.α.MCA, T.β.MCA, TDCA, CA-d4, GCDCA-d4, and DCA-d4 were procured from SHANGHAI ZZBIO CO., LTD. (Shanghai, China), while the remaining BA standards were obtained from Beijing Solarbio Science & Technology Co., Ltd. (Beijing, China). BAs in plasma, liver, and feces were measured as previously reported [31], using an AB Sciex QTrap 5500 LC-MS/MS system equipped with a Kinetex Biphenyl column (2.1 × 100 mm, 2.6 µm; Phenomenex). Negative ion mode was used for all analyses except cholesterol, which was analyzed in positive ion mode with an APCI source [32]. Data processing was performed using SCIEX OS-Q, including quality control and data preprocessing (batch-effect adjustment and missing-value imputation).
RT-qPCR and western blot analyses. Total RNA from colon tissue was extracted using the FastPure Total RNA Isolation Kit (Vazyme, Nanjing, China) in accordance with the manufacturer's instructions. RT-qPCR was then performed following our established protocol. All results were calculated with the 2^(−ΔΔCt) method, normalized to GAPDH. For western blot analysis, total protein from the liver, colon, and ileum was extracted using radioimmunoprecipitation assay buffer supplemented with 1 mmol/L phenylmethylsulfonyl fluoride, and protein concentration was determined using a BCA kit (Vazyme, Nanjing, China). Protein extracts were separated on 10% or 12% sodium dodecyl sulfate-polyacrylamide gels at 80-110 V and transferred to polyvinylidene difluoride filter membranes (0.22 µm; Millipore, USA) at 0.25 A. Membranes were blocked for 2 hours in Tris-buffered saline with 0.1% Tween 20 (TBST) supplemented with 5% skim milk and then incubated with primary antibodies at 4°C overnight. Afterward, membranes were washed three times with TBST and incubated for 2 h with HRP-linked anti-rabbit/mouse IgG secondary antibody. All membranes were developed using ECL Plus solution (Affinity Biosciences, Changzhou, China) and scanned, and target band density was quantified using ImageJ software.
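The 2^(−ΔΔCt) calculation named above can be sketched as follows. This is a minimal illustration of the standard formula, not the authors' protocol; the function name relative_expression and the example Ct values are hypothetical, and the reference gene is assumed to be GAPDH as stated in the text.

```python
def relative_expression(ct_target_sample: float,
                        ct_reference_sample: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Relative expression of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene, e.g. GAPDH), computed per condition.
    ddCt = dCt(treated sample) - dCt(control sample).
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies two cycles earlier in the
    # treated sample than in the control, relative to the reference gene.
    fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
    print(fold_change)  # 4.0
```

A value above 1 indicates higher expression in the sample than in the control, and a value below 1 indicates lower expression. The formula assumes roughly equal and near-100% amplification efficiency for the target and reference genes.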
16S rRNA gene sequencing of gut microbiota and Acetatifactor spp. processing. Fecal genomic DNA was extracted using the FastDNA SPIN Kit (MP Biomedicals, Santa Ana, CA, USA). The V3-V4 hypervariable regions of the 16S rRNA gene were amplified with the primers 341F (5’-CTACGGGNGGCWGCAG-3’) and 805R (5’-GACTACHVGGGTATCTAATCC-3’), followed by sequencing on an Illumina NovaSeq 6000 sequencer. All analyses were conducted using the QIIME2 (v2023.5) pipeline [19]. After quality control, 230 bp of the paired ends were retained. DADA2 was used to generate representative sequences and to record hash values for subsequent sequence extraction. Diversity was calculated using q2-diversity, retaining ‘Unweighted UniFrac’ for further analyses. Taxonomic classification (sklearn v0.24.0) was conducted using the same database as for the AGP taxonomy.
A total of 152 sequences from Acetatifactor spp. were downloaded from NCBI and analyzed together with 7 sequences from the MAGs. The phylogenetic tree and functional annotation results were obtained using the same methodology as described above for the MAGs. The 16S rRNA sequences were extracted using Barrnap (v0.9), a database was built with ‘makeblastdb’ (BlastN v2.1.1), and the sequences were then compared against the Acetatifactor spp. representative 16S rRNA sequences [33].
Co-abundance groups (CAGs). CAGs were constructed following the algorithm of Liu et al. [34]. OTUs present in at least 20% of all samples, with a collective abundance exceeding 0.01%, were selected. The SparCC algorithm [35], with 100 bootstrap iterations, was applied to estimate correlations among the selected OTUs. Only OTUs with correlation scores above 0.4 and P < 0.05 were assigned to CAGs (n = 123). The correlation values were transformed into a correlation distance (1 − correlation value), and OTUs were clustered using the Ward clustering algorithm implemented in the R package WGCNA (v1.72-1). The resulting CAG network was visualized using Cytoscape (v3.2.1).
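The thresholding and distance transformation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the correlations and P values are taken as already computed (for example by SparCC) and supplied as dictionaries keyed by OTU pairs, and the helper names correlation_distance and significant_pairs are hypothetical. Combining the filter and the transform in one helper is a simplification for the example. The Ward clustering step itself, which the authors ran in R, is not reproduced here.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def correlation_distance(correlation: float) -> float:
    """Transform a correlation value into a distance (1 - correlation)."""
    return 1.0 - correlation


def significant_pairs(correlations: Dict[Pair, float],
                      p_values: Dict[Pair, float],
                      min_correlation: float = 0.4,
                      alpha: float = 0.05) -> Dict[Pair, float]:
    """Keep OTU pairs with correlation > 0.4 and P < 0.05.

    Returns the retained pairs mapped to their correlation distance.
    """
    retained = {}
    for pair, correlation in correlations.items():
        if correlation > min_correlation and p_values[pair] < alpha:
            retained[pair] = correlation_distance(correlation)
    return retained


if __name__ == "__main__":
    # Hypothetical correlations and P values for three OTU pairs.
    corr = {("otu1", "otu2"): 0.8, ("otu1", "otu3"): 0.3, ("otu2", "otu3"): 0.6}
    pval = {("otu1", "otu2"): 0.01, ("otu1", "otu3"): 0.01, ("otu2", "otu3"): 0.20}
    print(significant_pairs(corr, pval))
```

In this toy input only the first pair is retained: the second fails the correlation threshold and the third fails the significance threshold.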
Statistical analysis. Multiple-regression analysis by forward selection and single regression analysis were used to identify the gut microbiota associated with BMI and T2DM in the AGP cohort (linear models in MaAsLin2 (v1.7.3)) [36]. The output from the QIIME2 pipeline in Biom table format was imported into R (v4.3.0) (https://www.R-project.org/) for analysis. Alpha-diversity indices were calculated using the estimate_richness function from the R package ‘phyloseq’. For beta-diversity, the Bray–Curtis distance of genus-level data was computed using the vegdist function from the R package ‘vegan’. Principal Coordinate Analysis (PCoA) was conducted with the dudi.pco function from the R package ‘ade4’. PERMANOVA with 9999 permutations was used to assess the statistical significance of differences between groups. To determine the variation attributed to each of the collected host factors in the diabetic groups, we conducted an Adonis test in R. Unless otherwise specified, statistical significance for lipid and glucose metabolism-related parameters and pro-inflammatory mediators in plasma, BAs, and gut microbiota was assessed using the Mann-Whitney U or Kruskal-Wallis test, as appropriate (rstatix v0.7.2), with Benjamini-Hochberg (BH) adjustment for multiple comparisons and a significance threshold of p < 0.05. Differential analysis of functional genes in MAGs was performed with DESeq2 (v1.40.2), with criteria set at log2FoldChange > 1 and an adjusted p-value (padj) < 0.05. Correlations between BAs and CAGs were assessed with the psych package (v2.3.6) using the Spearman correlation method and Benjamini-Hochberg adjustment for multiple testing (padj < 0.05). All results were visualized using ggplot2 (v3.4.2) and Adobe Illustrator 2020.
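For readers unfamiliar with the Benjamini-Hochberg adjustment used throughout, a minimal Python sketch of the standard procedure follows. The analyses in this study were run in R; this is an illustration of the method rather than the authors' code, and the function name bh_adjust is hypothetical.

```python
from typing import List, Sequence


def bh_adjust(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Each p-value is multiplied by m / rank (rank 1 = smallest p-value), and
    monotonicity is enforced by taking a running minimum from the largest
    p-value downwards. Adjusted values are capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, bh_adjust(raw)):
        print(f"p = {p:.3f}  padj = {q:.3f}")
```

A test is then called significant when its adjusted value falls below the chosen threshold (0.05 in this study), which controls the expected proportion of false discoveries among the significant results.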