Identification of a brand intratumor microbiome signature for predicting prognosis of hepatocellular carcinoma

Given that prognosis of hepatocellular carcinoma (HCC) differs dramatically, it is imperative to uncover effective and available prognostic biomarker(s). The intratumor microbiome plays a significant role in the response to tumor microenvironment, we aimed to identify an intratumor microbiome signature for predicting the prognosis of HCC patients accurately and investigate its possible mechanisms subsequently. The TCGA HCC microbiome data (TCGA-LIHC-microbiome) was downloaded from cBioPortal. To create an intratumor microbiome-related prognostic signature, univariate and multivariate Cox regression analyses were used to quantify the association of microbial abundance and patients’ overall survival (OS), as well as their diseases specific survival (DSS). The performance of the scoring model was evaluated by the area under the ROC curve (AUC). Based on the microbiome-related signature, clinical factors, and multi-omics molecular subtypes on the basis of “icluster” algorithm, nomograms were established to predict OS and DSS. Patients were further clustered into three subtypes based on their microbiome-related characteristics by consensus clustering. Moreover, deconvolution algorithm, weighted correlation network analysis (WGCNA) and gene set variation analysis (GSVA) were used to investigate the potential mechanisms. In TCGA LIHC microbiome data, the abundances of 166 genera among the total 1406 genera were considerably associated with HCC patients’ OS. From that filtered dataset we identified a 27-microbe prognostic signature and developed a microbiome-related score (MRS) model. Compared with those in the relatively low-risk group, patients in higher-risk group own a much worse OS (P < 0.0001). Besides, the time-dependent ROC curves with MRS showed excellent predictive efficacy both in OS and DSS. Moreover, MRS is an independent prognostic factor for OS and DSS over clinical factors and multi-omics-based molecular subtypes. The integration of MRS into nomograms significantly improved the efficacy of prognosis prediction (1-year AUC:0.849, 3-year AUC: 0.825, 5-year AUC: 0.822). The analysis of microbiome-based subtypes on their immune characteristics and specific gene modules inferred that the intratumor microbiome may affect the HCC patients’ prognosis via modulating the cancer stemness and immune response. MRS, a 27 intratumor microbiome-related prognostic model, was successfully established to predict HCC patients overall survive independently. And the possible underlying mechanisms were also investigated to provide a potential intervention strategy.


Introduction
Great progress in the diagnosis and treatment of human hepatocellular carcinoma (HCC) has been made to ameliorate the patients' suffering, however, HCC still occurs to be the second and fifth estimated leading cause of death in Yisu Song and Ze Xiang and contributed equally to this work and share the first authorship.
Extended author information available on the last page of the article 1 3 China and US Xia et al. 2022). Due to the complexity and heterogeneity of HCC, variate responses to the treatments lead to significant disparate burdens among the patients (Luo et al. 2021). Thus, it is imperative to identify new therapeutic and prognostic biomarkers, as well as more powerful targets for treatments. Basically, the current biomarkers were mainly established on cellular-related profiles, such as genomic and proteomic profiles. Microbiome, one population of special commensal, has arisen remarkable interest over therapeutic strategies in cancer, especially for personal medical therapies. Despite increasing evidence on the importance of gut microbiome in cancer (Helmink et al. 2019) and other gastrointestinal diseases (Hajj Hussein et al. 2023), the clinical effects of the intratumor microbiome with its mechanism have not been fully explored.
There are an increasing number of researches aimed at grasping the significance of the microbiome in HCC development and progression. Gut microbiome and oral microbiome have been identified to be non-invasive diagnostic biomarkers for HCC (Rao et al. 2020). Using mouse models, leaky gut and dysbiosis in the gut microbiota have been interacted with HCC since the subpopulation of gut bacteria could alter the production of certain metabolites or microbiota-associated molecular patterns, and augment the systematic LPS level at different stages of HCC development in mice, which leads to the chaos of immune microenvironment (Dapito et al. 2012;Gäbele et al. 2011;Ma et al. 2018). Interestingly, recent investigations also revealed that both the gut and tissue-resident microbiota are able to promote tumor metastasis (Fu et al. 2022;Li et al. 2019). Moreover, distinct gut microbiome characteristics have been found in HCC tissue in comparison to healthy breast tissue, which may regarded as non-invasive biomarkers for HCC diagnosis and potential targets for HCC prevention (Kang et al. 2022;Ren et al. 2019). Researchers referred that Bacteroides, Lachnospiracea incertae sedis and Clostridium XIVa appear to be enriched in HBV-related HCC patients with a high tumor burden , indicating that these microbes may exert negative effects in these patients. All these findings offer evidence of a significant relationship between the microbiome and the carcinogenesis and progression of HCC.
In addition, investigations also show that the microbiome plays a significant role in the patients' sensitivity to radiotherapy, chemotherapy and immunotherapy for cancer (Al-Qadami et al. 2019;Behary et al. 2021;Chiba et al. 2020). Given that immunotherapy is part of the first-line treatment for HCC, it is worth noting that microbiome has a dual role in tumor immune response. For instance, with germ-free or antibiotic-treated mice, researchers found that fecal transplantation from patients who respond to immune checkpoint blockade could improve the antitumor effect of PD-1 inhibitor (Routy et al. 2018). On the other side, the immunostimulatory role of Bifidobacterium and B fragilis was further proved, which augment the efficacy of anti-PD-L1 and anti-CTLA-4 immunotherapy (Sivan et al. 2015;Vétizou et al. 2015). In total, these results imply that the microbiome is prospective for both diagnosis biomarkers and treatment targets.
The purpose of this study was to identify an intratumor microbiome-related signature for establishing a prognostic scoring system, assessing its clinical impact using TCGA LIHC microbiome data, and attempt to investigate the mechanism beneath this tumor commensal.

Collection of PAC datasets and preprocessing
The Cancer Genome Atlas (TCGA) liver hepatocellular carcinoma microbiome (TCGA-LIHC-microbiome) was downloaded from the cBioPortal (https:// www. cbiop ortal. org/) (Cerami et al. 2012). mRNA, miRNA, copy number variation and DNA methylation statistics of TCGA HCC patients with their clinical information were downloaded from UCSC Xena (http:// xena. ucsc. edu/). Specially, the microbial profile was established from whole-transcriptome sequencing filtering and analyzing of 356 HCC samples (Poore et al. 2020), where a total of 1406 genera were detected and quantified in HCC. In addition, "iClusterPlus" (version 1.26.0) package was used to analyze the molecular subtypes of HCC patients by integrating multi-type genomic data (Mo et al. 2018).

Microbial signature and prognostic scoring model
For the microbiome-related scoring (MRS) model construction, a systematic analysis was conducted step by step in the TCGA-LIHC-microbiome dataset: (1) Univariate Cox regression analysis (survival package in R, Version 3.4-0) was used to identify so-called OS related microbes from the 1406 candidates whose abundance were remarkably related to patients' OS. OS-related microbes were further analyzed via Kaplan-Meier analysis (survminer package in R, Version 0.4.9) and log-rank test (survival package in R, Version 3.4-0), where TCGA-LIHC cohort was divided into high and low abundance groups for each microbe (survminer package in R, Version 0.4.9). Subsequently, (2) A forward conditional multivariate Cox regression (survival package in R, Version 3.4-0) for analysis of OS-related microbes was performed to select a set of independent microbes, regarded as the microbiome-related signature. In this study, we identified a 27-microbe prognostic signature, then (3) we constructed the MRS model by the following formula where the coefficients were obtained from Cox regression analysis of the microbiome prognostic signature:

3
All patients were divided into three subtypes via X-tile software which was applied to select the best cut points. The multivariate Cox regression was used to assess the independent prognostic impact of MAPS by adjusting for the clinical factors (age, stage, AFP level) and icluster-based molecular subtype.

Nomogram
A nomogram model was generated to assist the prediction of 1-, 3-, and 5-year OS and disease-specific survival (DSS) rate of HCC patients (rms package in R, Version 6.4-1). The performance of the nomogram model was evaluated based on the time-dependent receiver-operating characteristic (ROC) curve (survivalROC package in R, Version 1.0.3.1).

Microbial cluster, WGCNA and functional enrichment analysis
Using the "ConsensusClusterPlus" package (version 1.62.0), unsupervised clustering was applied to classify patients into 3 distinct clusters according to their intratumor microbiome abundance, repeated 1000 times to ensure classification stability (Wilkerson and Hayes 2010). Moreover, to identify the key module correlated with the three clusters and construct module-trait relationships, Further, to identify coexpressed gene networks, the WGCNA R package (version 1.72-1) was employed to analyze the clustered microbiomebased subtypes (Langfelder and Horvath 2008). The median absolute deviation (MAD) top 5000 genes was screened for network constructions with a soft thresholding power β = 6 and minModuleSize was set as 30. Subsequently, parameters of hub genes of the specific module were set as gene significance (GS, Pearson's correlation between each gene and clinical trait) > 0.1 and module membership (MM, correlation between each gene and module) > 0.8. Afterward, functional enrichment analysis of module-related hub genes were performed by clusterProfiler R package based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) (version 4.6.1) (Yu et al. 2012). The stemness of each tumor sample was quantified by ssGSEA, and the stemness signatures were collected as previously described (Zheng et al. 2022;Helmink et al. 2019).

Tumor microenvironment (TME) infiltrations exploration and immune response prediction
CIBERSORT, one of the robust deconvolution algorithms which quantifies the relative proportions of 22 immune cells on the basis of normalized bulk sample gene expression profiles (Newman et al. 2015). Additionally, ESTIMATE algorithm could assess immune and stromal cellular infiltrations in tumor samples (Yoshihara et al. 2013). Thus, we quantified the TME fractions of each HCC sample via "CIB-ERSORT" R script with 1,000 permutations and calculate the immune and stromal score through ESTIMATE in R package (version 1.0.13). Tumor Immune Dysfunction and Exclusion (TIDE), a reliable online algorithm (http:// tide. dfci. harva rd. edu/) was used to model tumor immune evasion as a higher TIDE score refers to a poorer immune response and worse immune evasion. Therefore, we employed it to estimate immunotherapeutic responses of each HCC patient (Jiang et al. 2018).

Statistical analysis
All statistical results were analyzed through R (version 4.2.2, https:// www.r-proje ct. org/). Normally distributed variables were analyzed by the student's t-test or Anova test. Nonnormally distributed variables were analyzed via the Wilcoxon rank-sum test or the Kruskal-Wallis test. Survival analysis was performed via the Kaplan-Meier method and the cox proportional hazards model to analyze associations between factors and prognosis by "survival" (version 3.4-0) and "survminer" (version 0.4.9) packages. Graphic visualizations were performed in R (ggplot2 package, version 3.4.1; ggpubr package, version 0.5.0). The difference was defined as statistically significant when adjusted P < 0.05 (two-tailed).

Establishment of microbiome-related model for hepatocellular carcinoma
By applying univariate Cox regression analysis, we identified 166 genera from the total 1046 genera detected in TCGA-LIHC dataset, among which 139 were favorable factors (Hazard ratio (HR) < 1, P < 0.05) and 27 were risk factors (HR > 1, P < 0.05) for patients' OS ( Fig. 1a). Moreover, to evaluate the relationship between microbiome and OS, TCGA-LIHC patients were divided into two generaabundance-based groups via the best cut point optimized by the most significant variance in OS. The specific effects of microbiome abundance on OS were estimated through Kaplan-Meier curves and log-rank test. To name only a few examples, the OS in patients whose tumor owns a higher abundance of Actinotignum and Dolosigranulum is remarkably longer (Fig. 1b, c) while in patients whose tumor owns a high abundance of Francisella and Actinobacillus is significantly shorter (Fig. 1d, e).

Independent value of MRS on prognosis prediction
In TCGA-LIHC microbiome dataset, the 352 patients were divided into high, middle and low groups based on their acquired MRS. Apparently, patients with a higher MRS had a significantly shorter OS (P < 0.0001) according to Kaplan-Meier log-rank test (Fig. 2b). Besides, for 1-, 3-, and 5-year OS, their AUCs were 0.788, 0.756 and 0.700 respectively. DSS was also found influenced by MRS remarkably (Fig. 2c). Patients with a higher MRS owned a rather shorter OS than those with a lower MRS and the AUCs for 1-, 3-, 5-year DSS were 0.748, 0.709 and 0.658 (Fig. 2d, e).
Serum alpha-fetoprotein (AFP) is widely regarded as a promising biomarker for prognostic stratification (Hu et al. 2022), and we also observed that patients with a positive serum AFP level that is more than 20 ng/ml have a significantly shorter OS ( Fig. S1) in this dataset. Previously, we have found that an AFP level equals to 400 ng/ml in HCC patients is a pivotal turning point in the transition of molecular characteristics based on (TMT) technology ) and it has an irreplaceable value on the diagnosis and prediction of patients' prognosis. Therefore, we examined whether MRS has an impact on different serum AFP levels by analyzing OS and DSS for AFP > 400 ng/ml and AFP < 400 ng/ml patients with MRS. Interestingly, higher MRS was related to a poor OS and DSS in AFP < 400 ng/ml group (P < 0.0001, Fig. S2A, C), whereas MRS only take an effect on the OS in AFP > 400 ng/ml group (P < 0.0001, Fig.  S2B) but no effect its DSS (P = 0.095, Fig. S2D), which indicates that MRS has an AFP-independent value on patients' survival.
To implement precision medicine in HCC, the identification of molecular subtypes of HCC has recently been The P values of differences between the two groups are calculated by log-rank test. OS, overall survival; HCC, hepatocellular carcinoma 1 3 researched extensively. For example, Fan Jia group revealed that hepatitis B virus (HBV) related HCC can be classified into three types featured by metabolic reprogramming, microenvironment dysregulation and cell proliferation via integrated multi-omics characterization (Gao et al. 2019;Cerami et al. 2012). As for TCGA-LIHC microbiome datasets, we integrated mRNA, miRNA, DNA methylation and CNV information of each sample and stratified them into three subtypes (Fig. 3a). Among those, OS (P < 0.0001, Fig. 3b), DSS (P = 0.0025, Fig. 3c) and PFS (P = 0.019, Fig. 3d) were significantly different and icluster2 type has the poorest prognosis. Subsequent to the further analysis of the impact of MRS on these three subtypes, it was surprisingly found that a high MRS refers to a remarkably poor OS or DSS consistently in all icluster based subtypes ( Fig. 4a-f). Moreover, in multivariate Cox regression analysis, we concluded the clinical factors such as gender, age, tumor stage and the condition of vascular invasion as well as the mentioned characteristics including serum AFP level and icluter subtypes on patients' OS and DSS. As we expected, MRS is not a protective factor either in OS (HR = 1.9, P < 0.0001, Fig. 5a) or DSS (HR = 1.83, P < 0.0001, Fig. 5b). Above evidence implicated that MRS is able to work on patients' survival independently from not only their clinical factors but also serum and molecular characteristics. By constructing nomogram model with or without MRS (Fig. 6a, c), we assessed its value on prediction of patients' 1-,3-and 5-year survival probability comprehensively, which showed an obvious increase in AUC (without MRS, 1-year AUC: 0.756, 3-year AUC: 0.688, 5-year AUC: 0.681; with MRS, 1-year AUC: 0.849, 3-year AUC: 0.825, 5-year AUC: 0.822).

Potential mechanisms beneath the microbiome in HCC
Given that MRS improved the predictive power of HCC prognosis, it is worth investigating how the intratumor microbiome influence the biological activity of HCC. By an unsupervised cluster method according to the abundance of 27 microbiome, HCC patients in TCGA can be clustered into three subtypes (Fig. 7a). In C1 cluster, patients mainly possess a high abundance of Aquamavirus, Crenobacter, Thalassobius, Aliagarivorans, Robinsoniella, Melissococcus, Shuttleworthia and Methylohalobius. As for the C2 cluster, patients own a high abundance of Olsenella, Candidatus Contendobacter, Amycolatopsis, Alicyclobacillus, Caldimonas, Caballeronia, Pantoea, Acidithrix, Desulfosarcina, Caenimonas and Sediminibacterium. And in the C3 cluster, Dolosigranulum, Rheinheimera, Ornithinimicrobium and Roseivirga are the most increased genera (Fig. 7d). Then, we compared OS among these clusters, which reveals that patients in C1 have the best OS while those in C3 have the poorest one (Fig. 7b). However, MRS between C1 and C3 does not have a significant difference unexpectedly (Fig.  S3). Since there are tremendous studies reporting that microbiome exerts an effect on cancer immune responses, we predicted each patients' immune response based on TIDE. Though it did not show a significantly variant ratio of immune evasion among clusters, there is still a descending trend in the ratio of immune responders (Fig. 7c). Attempting to explain this phenomenon, we calculated the expression of immune-related molecules and the abundance of infiltrative immune cells with a series of deconvolution methods described before. It reveals that only naïve B cells, follicular helper T cells and M1 type macrophages are variants while the abundance of other cellular types such as CD8 T cells remain almost the same (Fig. 7e). And for immune molecules, we defined CD274, CTLA4, PD1, CD160, LAG3, IDO1, HAVCR2 as immune suppressive markers and CXCL10, CXCL9, IFN-γ, TBX2, GZMA, GZMB, PRF1 and CD8a as immune active markers. It uncovers that HCC microbiome-related subtypes have a distinct expression of CD274, CD160, IDO1, HAVCR2, CXCL10, CXCL9 and PRF1 (Fig. 7f). In total, immune activation maybe more powerful than immune suppression in the C1 cluster but C3 cluster may own the opposite status. Interestingly, both in the abundance of immune infiltrative cells and expression of immune-related molecules, C1 and C3 cluster have a higher mean level than the C2 cluster, which is consistent with their MRS but cannot explain their differences in OS completely.
Since C3 cluster patients acquire the least benefit from immunotherapy and the poorest survival, we subsequently employed the WGCNA method to mine crucial gene modules of this subtype in the TCGA-LIHC microbiome dataset. Initially, we set the optimal soft-threshold power at 6 ( Fig. 8a) and the least gene numbers in each module at 30. Then genes with similar expression were clustered into 8 modules (Fig. 8c). Among these 8 modules, the turquoise module showed the strongest positive correlation with the C3 cluster (ME = 0.12, P = 0.02) and the most negative association with the C2 cluster (ME = 0.11, P = 0.05) at the same time (Fig. 8d). Thus, the turquoise module was selected to perform further analysis with the criteria of MM > 0.8 and GS > 0.1 to filter 127 hub genes (Fig. 8e, f). Then we employed KEGG pathway enrichment analysis in the turquoise module and the result revealed that one of the principally enriched pathways was termed "Signaling pathways regulating pluripotency of stem cells" (Fig. 8g), which partly indicated that intratumor microbiome possibly affect the stemness of tumor cells. Hence, we calculated 26 reported stemness gene sets score in each HCC sample by ssGSEA algorithm, in which 9 stemness gene sets were significantly different among the three clusters (Fig. 8h). Compared with their mean scores, the trend is similar to their OS, which may finally elucidate the assumption mentioned above.

Discussion
The microbiota plays an important role in human health and diseases (Xia et al. 2023;Xiang et al. 2023). In recent years, Organs and tissues traditionally regarded as sterile have been found to contain different microbial populations, and these microbial populations play a vital role (Xue et al. 2023).
Many studies have found that the intratumor microbiome matters in the response to the tumor microenvironment, which may influence the prognosis of patients (Huang et al. 2004;Qu et al. 2022). Nevertheless, the relationship between the prognosis of HCC patients and intratumor microbiome has not been deeply discussed. Hence, this study aimed to explore the promising prognostic predictors by investing in the HCC intratumor microbiome signature, which will provide new insights into the prognosis of HCC patients.
Firstly, the microbiome-related model for HCC was successfully constructed. Through the univariate and multivariate Cox analyses, 27 microbes were identified to have an independent effect on the OS of HCC patients. Both prognosis risk and protective microbes were included. Several microbes have been found to play a vital role in cancers. For example, the abundance of Sediminibacterium was found to be important in both gastric cancer and lung cancer (Cheng et al. 2020;Nikitina et al. 2023), and the abundance of Tetragenococcus also matters in oral cancer (Guo et al. 2020). The MRS model for HCC was further established. Based on the acquired MRS, it was found that HCC patients with higher MRS owned shorter OS and DSS than those with lower MRS, which showed that the established MRS model can greatly predict the prognosis of patients with HCC.
It was also demonstrated that AFP stratification will not affect the prognosis prediction ability of the established MRS model although AFP is considered a prognostic biomarker for HCC patients (Johnson et al. 2022). Besides, HCC patients with a high MRS can have poor OS and DSS regardless of icluster based on subtypes. Furthermore, the MRS model was found to be an independent risk factor for   The forest plot illustrates the hazard ratio of clinical factors, pathological factors, molecular subtype and MRS analyzed by multivariate Cox regression for OS and DSS respectively the prognosis of HCC patients. Hence, we can conclude that MRS can effectively predict the OS and DSS of HCC patients independent of not only their clinical factors but also serum and molecular characteristics. So far, a large amount of biomarkers have been identified to predict the response to immunotherapy in HCC Shen et al. 2020). Nejman et al. found that different types of intratumor bacteria are mainly in cancer and immune cells, which further implied that the intratumor microbiome may have a close correlation with tumor immune characteristics (Nejman et al. 2020). Therefore, the microbiome may shed light on the future perspective of biomarkers for HCC immune therapy. By an unsupervised cluster method according to the abundance of 27 microbiome, enrolled HCC patients can be clustered into three subtypes, including C1, C2 and C3. C1 cluster has better OS compared to the C3 cluster although there exists no significant difference in MRS. We further explored the expression of immune-related molecules and the abundance of infiltrative immune cells, and we found that the immune activation in the C1 cluster is more powerful than suppression but the C3 cluster is opposite. However, both C1 and C3 cluster have a higher mean level of immune infiltrative cells and related molecules than the C2 cluster, which is consistent with the trend of MRS. This cannot explain differences in OS. Furthermore, the enriched pathways called "signaling pathways regulating pluripotency of stem cells" was found, and we explored the relationship between the stemness of tumor cells and intratumor microbiome. 9 stemness gene sets were found to significantly affect the intratumor microbiome. Through comparing their mean score, we demonstrated that it was consistent with OS. Hence, the above findings further showed that there exists a close correlation between the stemness of tumor cells and the intratumor microbiome.   h Ridge plot of 9 gene sets concerning cell stemness with significant difference calculated by Kruskal-Wallis test

Conclusion
A 27 intratumor microbiome prognostic signature named MRS was established successfully as a distinct model for predicting the prognosis of HCC patients. Moreover, MRS exerted an independent effect on OS and DSS on clinical factors, pathological factors and multi-omics-based molecular subtypes. Besides, we investigate potential mechanism of the intratumor microbiome and it implicated that intratumor microbiome mainly influence the prognosis of HCC via modulating cancer stemness. Furthermore, interventions targeting the microbiome raise the possibility for HCC patients to have a better prognosis.