Gut Microbiome Analysis As A Non-Invasive Tool for the Early Diagnosis of Cholangiocarcinoma

The liver-microbiome axis is implicated in the pathogenesis of hepatobiliary cancer, yet the role of the gut microbiota in cholangiocarcinoma (CCA) remains unclear. We conducted a case-control study on the intestinal flora of 33 CCA patients and 47 cancer-free (CF) individuals with cholelithiasis. We performed 16S rRNA gene sequencing to identify disease-related gut microbiota and assess the potential of the intestinal microbiome as a non-invasive biomarker for CCA. We found that the gut microbiome of CCA patients had a significantly higher alpha diversity (Shannon and observed species indices, p = 0.006 and p = 0.02, respectively) and an overall different microbial community composition (p = 0.032). The genus Muribaculaceae_unclassified was most strongly associated with CCA (p < 0.001). We propose a disease-predictive model comprising twelve intestinal microbiome genera that distinguished CCA patients from CF individuals with an area under the curve (AUC) of approximately 0.93 (95% CI, 0.85–0.987). The predictive performance of this model was better than that of CA19-9. Moreover, the genera Ezakiella and Garciella were observed only among intrahepatic cholangiocarcinoma patients. Further, we assessed alterations in predicted functional modules in CCA patients and uncovered a microbiota pattern specific to CCA.

Recent evidence indicates that the intestinal microbiota is a critical environmental factor for the pathogenesis of liver diseases through the gut-liver axis [8][9][10][11][12]. The gut microbiome composition reportedly differs among diseases and is implicated in the occurrence and progression of numerous cancers [13][14][15]. Evidence from the last decade has increasingly suggested that the gut microbiota plays a crucial role in the progression of liver cancer [16]. In some animal models, the severity of liver diseases reportedly depends on the inflammatory cancer-promoting microenvironment [17,18]. The inflammatory signals emerging from an altered gut microbiome have been considered a new potential carcinogenic mechanism [19]. These findings indicate the diagnostic potential of the microbiota for CCA. However, relatively few studies have investigated the characteristics of the intestinal microbiota among CCA patients [20]. Moreover, the distribution of intestinal flora cannot be distinguished between CCA patients and individuals with other liver diseases [21,22].
Here, we conducted a case-control study to characterize the intestinal flora of 33 CCA patients and 47 cancer-free (CF) individuals via 16S rRNA gene sequencing. We analyzed the microbial spectrum of the CCA and CF groups and identified the differences between them. Further, we characterized the intestinal microbiome of patients with intrahepatic (ICC), perihilar (PCC), and distal (DCC) cholangiocarcinoma. Ultimately, this study elucidates the specific gut microbiome composition that can be considered a non-invasive diagnostic biomarker for CCA.

Methods
Patient recruitment. Forty-seven cholelithiasis patients without cancer and 33 patients with CCA, diagnosed in accordance with the National Comprehensive Cancer Network (NCCN) guidelines [23] and histologically confirmed, were recruited between December 2018 and January 2020 at the First Affiliated Hospital of Wenzhou Medical University. The exclusion criteria were as follows: (1) ≤ 18 years old or ≥ 80 years old; (2) history of other malignancies; (3) autoimmune diseases; (4) receiving any chemoradiation, interventional, or immunological therapy; (5) receiving antibiotics or probiotics within the past 8 weeks [24]; (6) malabsorption, inflammatory bowel disease, symptoms of gastrointestinal obstruction, or diarrhea owing to a bacterial infection within 6 months; (7) special dietary habits, including vegetarianism or any diet restriction. The comprehensive baseline demographic and clinicopathological data of the participants, including age, sex, body mass index (BMI), smoking and alcohol habits, history of cirrhosis and hepatitis B virus infection, serum tumor markers, and liver function indicators, are presented in Fig. 1 and Supplementary Table S1 [25].
Sample collection, DNA extraction, and 16S rRNA gene sequencing. All fecal samples were freshly collected from participants before treatment during their hospital stay and were frozen and stored at -40 °C within 3 h of sampling [26]. To mitigate sampling bias, the middle portion of the fecal matter was sampled in all cases. Bacterial genomic DNA was extracted using the E.Z.N.A.® Stool DNA Kit (D4015, Omega, Inc., USA). The V3-V4 region of the 16S rRNA gene was amplified using slightly modified versions of the primers 341F (5'-CCTAGGGNGGCWGCAG-3') and 805R (5'-GACTACHVGGGTATCTAATCC-3'). DNA was extracted in ultrapure water to exclude the possibility of false-positive PCR results. The PCR products were purified using AMPure XT beads (Beckman Coulter Genomics, USA) and quantified using Qubit (Invitrogen, USA). The amplicon pools were prepared for sequencing, and the size and quantity of the amplicon libraries were assessed using the Agilent 2100 Bioanalyzer (Agilent, USA) and the Library Quantification Kit for Illumina (Kapa Biosciences, USA) in accordance with the manufacturers' instructions. Samples were sequenced on an Illumina NovaSeq platform in accordance with the manufacturer's instructions; sequencing was performed by LC-Bio Technology Co., Ltd., China.
Bioinformatic analysis. Raw sequencing reads were analyzed using the QIIME2 software package. Quality filtering of the raw reads was performed under specific filtering conditions in fqtrim (V.0.9.4) to obtain high-quality clean tags. Sequences with 100% similarity were assigned to the same feature. The DADA2 software was used to filter the sequencing reads and construct the feature table and feature sequences. Consequently, an average of 62507.2 reads per sample (min = 14078, max = 75609) was included, with 60913.1 (min = 14078, max = 73401) in the CCA group and 63626.4 (min = 39010, max = 75609) in the CF group.
Sequence alignment for species annotation was performed using BLAST against the SILVA and NT-16S databases. The dominant species in the different groups were identified, and multiple sequence alignment was performed using the MAFFT software (V.7.310). Alpha diversity was assessed using the Chao1, observed species, Good's coverage, Shannon, and Simpson indices. Beta diversity was assessed through principal coordinate analysis (PCoA).
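As an illustration of the alpha diversity indices named above, the following minimal sketch computes them from the per-feature read counts of a single sample. It is not the QIIME2 implementation used in this study; the function name `alpha_diversity`, the bias-corrected form of Chao1, the Gini-Simpson form of the Simpson index, and the natural-log default for the Shannon index are assumptions made for illustration.

```python
import math

def alpha_diversity(counts, log_base=math.e):
    """Compute alpha-diversity indices from one sample's per-feature read counts.

    Assumptions (not taken from the study): bias-corrected Chao1,
    Gini-Simpson index (1 - sum p^2), natural-log Shannon by default.
    """
    counts = [c for c in counts if c > 0]          # drop absent features
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    props = [c / total for c in counts]            # relative abundances
    singletons = sum(1 for c in counts if c == 1)  # features seen exactly once
    doubletons = sum(1 for c in counts if c == 2)  # features seen exactly twice
    observed = len(counts)                         # observed richness
    return {
        "observed": observed,
        "chao1": observed + singletons * (singletons - 1) / (2 * (doubletons + 1)),
        "goods_coverage": 1.0 - singletons / total,
        "shannon": -sum(p * math.log(p, log_base) for p in props),
        "simpson": 1.0 - sum(p * p for p in props),
    }

# Hypothetical sample: five detected features and one absent feature.
indices = alpha_diversity([10, 5, 2, 1, 1, 0])
```

Richness-type indices (observed species, Chao1) respond to the number of features, whereas Shannon and Simpson also weight their evenness, which is why the two families can rank samples differently.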
Statistics. The Wilcoxon rank-sum test was used to identify significant differences in microbial abundance, followed by linear discriminant analysis effect size (LEfSe) analysis (https://huttenhower.sph.harvard.edu/galaxy/) to identify the differentially enriched microorganisms. Further analyses were performed in R (v3.5.2). LASSO-logistic regression analysis was performed to construct a model that could best distinguish CCA from CF. The receiver operating characteristic (ROC) curve of the model was constructed using the pROC package. The area under the curve (AUC) values for each genus alone and for all 12 genera together were determined to evaluate the performance of the potential biomarkers. The correlations between clinical variables and microorganisms were analyzed and displayed using a heatmap. The random forest algorithm was used to assess the importance of clinical variables and microorganisms in distinguishing CCA from CF. A P-value of < 0.05 was considered significant.
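The rank-sum statistic and the AUC used above are closely related: the Mann-Whitney U statistic divided by the number of case-control pairs equals the AUC of a single marker. The sketch below illustrates that relation in plain Python; it is not the R/pROC code used in this study, and the function names `midranks` and `rank_sum_auc` are hypothetical.

```python
def midranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def rank_sum_auc(cases, controls):
    """Return (U statistic for the cases, AUC) for one marker.

    AUC is the probability that a random case scores higher than a
    random control, counting ties as one half.
    """
    n_cases, n_controls = len(cases), len(controls)
    ranks = midranks(list(cases) + list(controls))
    rank_sum_cases = sum(ranks[:n_cases])
    u = rank_sum_cases - n_cases * (n_cases + 1) / 2
    return u, u / (n_cases * n_controls)

# Hypothetical abundances of one genus in three cases and three controls.
u_stat, auc = rank_sum_auc([3, 4, 5], [1, 2, 3])
```

An AUC of 0.5 corresponds to no separation between the groups, and values near 1 (or near 0 for a marker depleted in cases) indicate strong separation.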

Baseline characteristics of the participants
To elucidate the differences in the gut microbiota between CCA and CF, 33 CCA patients and 47 cancer-free individuals were recruited; all participants were hepatitis C virus (HCV) negative. The CCA patient group comprised 16 cases of ICC and 17 cases of PCC and DCC. ICC accounted for > 48% of the CCA cohort, consistent with the epidemiological characteristics of cholangiocarcinoma over the past two decades [1]. Fifteen (45.4%) CCA patients were classified as early-stage, including TNM stage I (13, 39.3%) and TNM stage II (2, 6.0%). The median levels of carbohydrate antigen 19-9 (CA19-9), carcinoembryonic antigen (CEA), and alpha-fetoprotein (AFP) were 237.7, 4.1, and 2.8, respectively. The baseline characteristics of the participants are detailed in Fig. 1 and Supplementary Table S1.

Gut microbiome diversity between CCA and cancer-free patients
We performed 16S rRNA gene sequencing using fecal microbiome samples from the participants. Alpha diversity and beta diversity are crucial indices when analyzing gut microbiota [27][28][29]. CCA patients had markedly higher observed species counts and Simpson diversity than cancer-free individuals (p < 0.001) (Fig. 2a). A detailed analysis indicated that 204 species and 79 genera were present in the CCA group, whereas 248 species and 106 genera were present in the CF group (Supplementary Figure S1). These findings indicate a change in the gut microbiota composition in accordance with the liver disease status.
PCoA was performed to assess the overall microbial diversity and revealed a significant global difference in the microbial composition between the CCA and CF groups (Unweighted UniFrac p = 0.001) (Fig. 2b). Further stratified analysis revealed no significant differences in the microbiome composition among ICC, PCC, and DCC (Unweighted UniFrac p = 0.958). Moreover, no significant difference was observed between early-stage (TNM stage I-II) and advanced-stage (TNM stage III-IV) CCA patients (Unweighted UniFrac p = 0.578) (Fig. 2b).
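The text does not name the test behind the group-level p-values reported on the UniFrac distances. One common choice for comparing groups on a distance matrix is PERMANOVA, sketched below under that assumption. The function names `pseudo_f` and `permanova` are hypothetical, and the distance matrix is taken as given, since computing UniFrac distances requires a phylogenetic tree that is not reproduced here.

```python
import random

def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F for a symmetric distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(label, []).append(index)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for members in groups.values():
        ss_within += sum(
            dist[a][b] ** 2
            for pos, a in enumerate(members)
            for b in members[pos + 1:]
        ) / len(members)
    ss_between = ss_total - ss_within
    k = len(groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)                 # break the sample-label link
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical one-dimensional example: two well-separated groups of three.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
f_value, p_value = permanova(dist, ["CCA"] * 3 + ["CF"] * 3)
```

With 999 permutations the smallest attainable p-value is 0.001, which is why permutation-based p-values are often reported at exactly that value.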

Differential analysis of the taxonomic abundance of the intestinal microbiome
The taxonomic abundance of the intestinal microbiomes of the CCA and CF groups at the phylum, family, and genus levels was examined. The anaerobic phylum Firmicutes was the most abundant phylum in both groups. The genus Bifidobacterium was significantly more abundant in the CF group (Fig. 3a). To further investigate which taxa contributed to the observed differences between the intestinal microbiomes of the CCA and CF groups, we preliminarily identified 98 genera and nine phyla that were differentially enriched between the CCA and CF groups. Specifically, six phyla and 70 genera were significantly enriched, whereas three phyla, 28 genera, and 15 species displayed decreased abundance in the CCA group compared to the CF cohort (|log2 FC| > 2, p < 0.05) (Supplementary Table S2, S3). In the CCA cohort, the phylum Bacteroidetes was remarkably enriched, whereas the abundance of the phylum Actinobacteria was significantly reduced (Fig. 3b).
Linear discriminant analysis effect size (LEfSe) analysis was next performed on the fecal microbiota composition of the CCA and CF cohorts. The analysis identified 60 differentially abundant bacterial taxa between the two cohorts (LDA score > 3.0, p < 0.05) (Fig. 3c). Specifically, the genus Muribaculaceae_unclassified was the most abundant in the CCA group, whereas the probiotic genus Bifidobacterium and Escherichia_Shigella were most abundant in the cancer-free group, consistent with the relative abundances of these genera. Thirty-two co-differentially enriched taxa were identified between the CCA and CF cohorts (Supplementary Table S4).
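The screening thresholds above (|log2 FC| > 2, p < 0.05) can be illustrated with a short sketch. How the fold change was computed is not stated, so the use of group means and a small pseudocount to avoid division by zero are assumptions, as are the function names and the example values.

```python
import math

def log2_fold_change(case_values, control_values, pseudocount=1e-6):
    """log2 ratio of mean relative abundance (cases over controls).

    The pseudocount is an assumption to keep the ratio defined when a
    taxon is absent from one group.
    """
    mean_case = sum(case_values) / len(case_values)
    mean_control = sum(control_values) / len(control_values)
    return math.log2((mean_case + pseudocount) / (mean_control + pseudocount))

def screen_taxa(abundances, p_values, fc_cutoff=2.0, alpha=0.05):
    """Keep taxa passing both the fold-change and the p-value thresholds.

    abundances maps taxon -> (case_values, control_values);
    p_values maps taxon -> p-value from a separate test.
    """
    selected = {}
    for taxon, (case_values, control_values) in abundances.items():
        lfc = log2_fold_change(case_values, control_values)
        if abs(lfc) > fc_cutoff and p_values[taxon] < alpha:
            selected[taxon] = lfc
    return selected

# Hypothetical relative abundances and p-values for three taxa.
abundances = {
    "taxon_a": ([0.08, 0.08], [0.01, 0.01]),    # about 8-fold higher in cases
    "taxon_b": ([0.02, 0.02], [0.01, 0.01]),    # about 2-fold higher in cases
    "taxon_c": ([0.001, 0.001], [0.05, 0.05]),  # lower in cases, p too large
}
p_values = {"taxon_a": 0.001, "taxon_b": 0.001, "taxon_c": 0.2}
selected = screen_taxa(abundances, p_values)
```

A cutoff of |log2 FC| > 2 corresponds to a more than four-fold difference in either direction, so only taxon_a is retained in this example.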

Gut microbiome-based predictive models for CCA diagnosis
To construct a gut microbiome-based predictive model to non-invasively diagnose CCA, the glmnet R package was used to develop a LASSO-logistic regression model, in which the regression coefficients of the 32 co-differentially enriched taxa were penalized and shrunk toward zero, and 12 genera were identified as candidate markers for distinguishing CCA patients from CF individuals (Fig. 4a, Supplementary Table S5). The Circos tool was used to visualize the distribution of the 12 genera in the two groups (Supplementary Figure S2). The 12 genera were separately considered as predictors, and the area under the receiver operating characteristic curve (AUC) was calculated for each, with the highest AUC being 0.84 for Muribaculaceae_unclassified (Supplementary Figure S3). Furthermore, we assessed the potential predictive value of combining the 12 genera, and our analysis revealed that the combination of the 12 genera could best distinguish CCA from CF, with an AUC of 0.93 (95% CI, 0.86-0.98) (Fig. 4b). In short, using all 12 genera together markedly improved predictive performance compared to that achieved with any genus alone.
To further validate the results, we fitted random forest models to the genus-level relative abundance data to rank the influence of the differentially abundant genera and the clinical variables in distinguishing the two cohorts. The combination of 12 genera had a greater influence than the clinical variables, whereas alkaline phosphatase (AKP) was the second most important variable (Fig. 4c).
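To illustrate how an L1 (LASSO) penalty selects a subset of predictors, the sketch below fits a penalized logistic regression by proximal gradient descent. This is a swapped-in teaching implementation, not glmnet (which uses coordinate descent and chooses the penalty by cross-validation); the function names, step size, iteration count, penalty values, and toy data are all assumptions.

```python
import math

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def lasso_logistic(X, y, lam, step=0.1, iters=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    X is a list of feature rows, y a list of 0/1 labels, lam the penalty
    strength. Returns (weights, intercept); the intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(iters):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data: feature 0 separates the classes, feature 1 carries no signal.
X = [[1.0, 0.5], [0.8, -0.5], [1.2, 0.5], [0.9, -0.5],
     [-1.0, 0.5], [-0.8, -0.5], [-1.2, 0.5], [-0.9, -0.5]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
weights, intercept = lasso_logistic(X, y, lam=0.05)
```

With a moderate penalty the informative feature keeps a non-zero coefficient while the uninformative one stays at exactly zero; with a sufficiently large penalty every coefficient is zero. This is the mechanism by which a larger candidate set can be reduced to a smaller panel of markers.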

Correlations among tumor stage, tumor subtype, clinical characteristics, and the gut microbiota composition
Cancer classification and staging play a pivotal role in therapy and prognosis [30]. Here, we compared the microbial compositions among the different CCA subtypes and stages (Fig. 5a). Several CCA-related microbiome characteristics, such as enrichment of the genus Muribaculaceae_unclassified and a decline in the genera Escherichia-Shigella and Bifidobacterium, were observed (Fig. 5b).
Furthermore, we compared the abundance of gut microbiota among the CCA subtypes and observed that the most significantly altered species in ICC samples belonged to the genera Murimonas, Ezakiella, and Garciella, whose abundance differed significantly from that in the PCC and DCC subgroups (p = 0.0153, p = 0.0421, and p = 0.0421, respectively) (Supplementary Table S6). Moreover, the genera Ezakiella and Garciella were observed only in ICC samples. These results provide evidence for the use of a stool test to assist the anatomical classification of CCA. The diagnostic value of the aforementioned three genera was examined through ROC curve analysis. The genus Murimonas yielded an AUC of 0.713 (95% CI, 0.65-0.78) for distinguishing ICC patients from those with other CCA subtypes (Fig. 5c). Further studies are required to explore the CCA subtype-specific microbiome.

Functions of the gut microbiota
To assess the metabolic functions of the microbial communities of the CCA and CF individuals, we used the PICRUSt2 package to predict the metagenome content based on the Clusters of Orthologous Groups (COG) protein database (https://www.ncbi.nlm.nih.gov/COG/). Predicted functions, for instance antioxidant proteins and pathways including amino acid production and anaerobic metabolism, were downregulated in CCA individuals, whereas ribosomal protein S21 (RPS21) was highly enriched (Fig. 6).

Discussion
The liver-microbiome axis is being increasingly implicated in the maintenance of metabolic homeostasis and the pathogenesis of diseases. The intestine is anatomically linked with the liver through blood circulation and systemic innervation [32]. Bile acids (BA) and antimicrobial molecules secreted by the liver are transported to the biliary tract and delivered to the intestines, where they inhibit intestinal bacterial overgrowth and thereby help maintain gut eubiosis. Recent studies have reported that Bacteroides is associated with BAs, particularly deoxycholic acid (DCA) [33,34]. Furthermore, the intestine hosts numerous symbiotic bacteria and other microorganisms, which produce various metabolites [35]. These metabolites, such as BA, choline, and short-chain fatty acids (SCFA), serve as important signaling factors and energy substrates and are transported to the liver via the portal vein, eventually influencing liver functions [36]. In recent years, accumulating evidence has implicated distinct alterations in the composition of the intestinal bacteria in several metabolic and inflammatory diseases [37]. Changes in the intestinal flora, resulting in epigenetic changes, influence the occurrence and development of hepatobiliary disease [38,39]. Low bacterial diversity is considered a major type of gut microbiome dysbiosis and has been observed in numerous diseases [26].
However, the association between the intestinal microbiome and CCA remains unclear. This study characterizes the composition of the intestinal flora of CCA and CF individuals via 16S rRNA gene sequencing. We observed that the fecal microbiome of CCA patients had higher species richness and evenness than that of CF individuals. Our results contrast with those of previous studies, indicating that the gut microbiome in CCA individuals warrants more extensive evaluation. At the genus level, Bifidobacterium was dominant in CF individuals, whereas Muribaculaceae_unclassified was dominant in the CCA patients. Interestingly, members of the genus Bifidobacterium, as beneficial bacteria, regulate hepatic and serum BA profiles primarily by reducing the levels of deoxycholic acid and increasing the levels of chenodeoxycholic acid and ursodeoxycholic acid. Furthermore, high levels of Bifidobacterium in the gut microbiome can suppress pro-inflammatory genes in the liver, achieving anticancer effects [40,41].
Thus far, CA19-9 has been the most commonly used tumor marker for CCA; nevertheless, its predictive performance remains unsatisfactory (AUC = 0.881) [42], and histological diagnosis remains the gold standard for diagnosing CCA. Non-invasive biomarkers for the early detection and diagnosis of CCA remain an unmet need [2]. Based on microbiome signatures, we established a microbiome-based model for CCA with high discriminative potential. Our findings provide evidence for the potential of non-invasive fecal testing for the early diagnosis of CCA. Several genera in this model may functionally affect CCA progression. For example, Agathobacter regulates SCFA fermentation and affects intestinal circulation [43]. Similarly, Rothia reduces lipopolysaccharide (LPS) transportation and liver damage by restoring intestinal barrier function [44]. However, specific associations between the gut microbiota and CCA pathogenesis warrant further investigation.
Furthermore, three genera, Murimonas, Ezakiella, and Garciella, were significantly enriched among ICC patients compared to the PCC and DCC patients. It is worth noting that Murimonas, Ezakiella, and Garciella are clostridial genera related to Clostridium, which includes pathogenic species that can cause gastrointestinal infections [44]. Clostridium is present in the gut microbiome of healthy individuals; however, it becomes harmful when the intestinal microbiota is unbalanced, leading to the production of several toxins, predominantly toxin A (TcdA), toxin B (TcdB), and binary toxin (CDT) [45]. These toxins can cause bloating and diarrhea and are prone to cause disease relapse, potentially causing outbreaks in hospitals and other healthcare facilities. Furthermore, a fraction of patients with recurrent Clostridium difficile-associated diarrhea respond poorly to traditional antibiotic therapy. Numerous studies have reported that Clostridium difficile markedly influences the incidence of diarrhea, especially antibiotic-associated diarrhea [46,47]. Hence, we collected samples before antibiotic treatment, and it is worth studying the intestinal microbiome of ICC patients prone to diarrhea.
To elucidate the functional classification of the microbiome, annotation based on the COG protein database was used. The inferred metagenomes of CCA were characterized by an increase in ribosomal protein S21 (RPS21) and a reduction in antioxidant proteins. RPS21 plays a pivotal role in cell growth, apoptosis, and the promotion of ribosome biogenesis, subsequently promoting tumorigenesis via MAPK activation [48][49][50][51]. Furthermore, Aconitase A-related microbes, which are reduced in CCA patients, reportedly contribute to cellular defense against oxidative stress by regulating targets including the histidine-rich protein Hpn, the alkyl hydroperoxide reductase AhpC, and the two-component regulatory protein FlgR [52].
The strength of this study lies in its detailed characterization of CCA and its anatomical subtypes (ICC, PCC, and DCC) and in the development of an intestinal bacteria-based model for the diagnosis of CCA in its early stages. Nevertheless, this study has some limitations. First, this is a single-center study with a limited sample size; hence, a larger multicenter trial should be conducted to validate our findings. Second, 16S rRNA gene sequencing cannot resolve the full genomic content of the microbiome, as metagenome sequencing would. Third, we controlled for dietary factors only by excluding subjects with special dietary habits such as vegetarianism or any other dietary restriction.

Conclusion
Our study provides novel insights into the functional role of the gut microbiome in disease pathogenesis by assessing the gut microbiome of a cohort of patients newly diagnosed with CCA and indicates the potential of intestinal microorganisms as non-invasive biomarkers for the early diagnosis of CCA.

The study complies with national standards for ethical, legal, and regulatory requirements and adheres to the tenets of the 2008 Helsinki Declaration and its amendments. Samples were collected in accordance with medical confidentiality and standard procedures. This study was approved by the Ethics Committee of the First Affiliated Hospital of Wenzhou Medical University (Ref No. 2020-074), and all subjects provided written informed consent.

Figure 1 Clinical characteristics of cholangiocarcinoma (CCA, n=33) and cancer-free (CF, n=47) participants. The chart shows significantly different variables between groups. Abbreviations: ALT, alanine aminotransferase; ALB, albumin; TB, total bilirubin; GGT, gamma-glutamyltransferase; AKP, alkaline phosphatase; AST, aspartate aminotransferase; BMI, body mass index.

Figures
Red circles and red shading represent genus-level microbiota significant in the CCA cohort; blue circles and blue shading represent genus-level microbiota significant in the CF cohort.

Functional gene differences between the CCA and CF cohorts. The bar chart on the left represents the proportion of each enriched metabolic pathway among all metabolic pathways within the two groups, and the corrected p value is listed on the right.
