Dysbiotic microbiota to complement TNM staging or HPV status for prognostic indication of head and neck cancer

Background: Microbiota has been found associated with the incidence of head and neck cancers (HNCs), however, the association of microbiota with the prognosis of HNCs remains unknown. In the present study, the relationship between tumoral microbiata and survival was examined. The prognosis predictive value of tumoral microbiota to complement classic TNM staging and Human Papillomavirus (HPV) status was also investigated. Results: We conducted a retrospective study including 158 primary tumors using 16S rRNA sequencing. The tumoral microbiota in the HNC patients with poor prognosis were signicantly different from that of the patients with good prognosis. A greater abundance of Schlegelella or Methyloversatilis was characterized in HNC patients with poor prognosis and a greater abundance of Lactobacillus or Bacillus was characterized in HNC patients with good prognosis. Strikingly, the microbial dysbiosis index (MDI), the ratio of above poor and good prognosis associated genera, was strongly associated with overall survival [hazard ratio (HR) 3.24; 95% condence interval (CI), 1.88 to 5.59; P < 0.001] and disease free survival (HR 2.11; 95% CI, 1.34 to 3.34; P = 0.001), independently of age, TNM staging, differentiation and HPV status. Intriguingly, the combination of the MDI and HPV status signicantly improved the prognostic performance of HPV status (AUC of 0.719 and 0.581 respectively, P = 0.024). Moreover, the MDI showed higher risk for overall survival in the late stage patients (HR 3.35; 95% CI, 1.73 to 6.49; P < 0.001) compared with the early stage patients (HR 2.41; 95% CI, 1.00 to 5.84; P = 0.051). Conclusion: Together, the MDI could serve as an applicable indicator of the prognosis of HNCs, and complements the predictive values of TNM staging and HPV status. Moreover, the ndings of the higher risk of the MDI in the


Background
There are more than 834,000 new cases of head and neck cancers (HNCs) resulting in over 431,000 deaths worldwide per year [1]. Despite the development and application of improved surgical techniques, the overall ve-year survival rate of HNC patients, approximately 50%, has not well been improved [2].
There is a critical and urgent need to develop effective approaches for prognostic indication to complement the efforts of traditional TNM staging, so as to stratify for precise and multi-disciplinary treatment strategies and improve the survival rate.
The oral cavity hosts a diverse community of bacteria, referred to as the oral microbiota, with more than 750 distinct bacterial taxa identi ed to date [3]. When the integrity of oral mucosal barrier is lost, the bacteria colonized on its surface subsequently translocate to the underlying tissues. The diversity or the composition of microbiota has been discovered to be involved in the initiation, the progression and the prognosis of various cancers, not only in gut or other cavities but also within sterile tissues [4][5][6][7][8][9][10][11][12]. For HNCs, there have been several studies reporting on oral microbiota [13][14][15][16][17][18][19] and quite limited studies reporting on tumoral microbiota [20][21][22][23][24][25], and their associations with the incidence. However, as to the association of the microbiota with the prognosis of HNCs, no investigation has been reported.
Human papillomavirus (HPV) is an etiological factor for HNCs [26]. Patients with HPV positive HNCs generally have a better prognosis than patients with HPV negative HNCs [27][28][29]. However, the understanding of the distinct prognoses between HPV positive and negative HNCs is limited. It has been demonstrated that vaginal microbiota is involved in HPV acquisition and persistence and the development of cervical carcinoma [30]. However, the association of HPV status with the microbiota of HNC has not been well studied. Importantly, the complement efforts of microbiota for HPV status have not been addressed.
To gain better understanding into the association of the tumoral microbiome with overall survival and the early relapse of HNCs, we conducted a retrospective study on a followed-up cohort. Tumoral microbiota from 158 OCC and OPC patients were assessed using high-throughput sequencing of the 16S ribosomal RNA (16S rRNA) gene. The associations of the indices representing the microbial pro le and the differentially abundant taxa with overall or disease free survival were analyzed. The complementary values of the microbial signatures for classical TNM staging HPV (p16) status in prognostic indication of HNCs were further determined. To understand the association of microbial signatures with prognosis of HNCs, their correlation with clinical-pathological characteristics as well as in ammatory in ltrations were analyzed.

Methods Patients
A total of 158 patients admitted to Qilu Hospital of Shandong University from January 2006 to March 2015 and diagnosed with carcinoma of oral cavity (body of tongue, oor of mouth, gums, hard palate or buccal mucosa) or pharynx (root of tongue and soft palate) were included. All patients received surgical resections and were free of tumor for all margins pathologically. All patients had no history of malignance or treatments. The median age at diagnosis was 57 years old (range 17 to 80 yrs). 93 males and 65 females were included. The majority of such tumors were squamous cell carcinoma (SCC). The baselines of the clinic-pathological characterstics are shown in Additional File 1: Table S1. Patients were followed up until death or until May 2019 (at least 50 months for alive patients, and median 61 months for all patients).

DNA extraction and 16S rRNA Sequencing
Bacterial genomic DNA was extracted from histology slides of tumors using TIANamp FFPE DNA Kit. The 16 s rRNA was ampli ed using primers (515F, 5'-GTGCCAGCMGCCGCGGTAA-3', and 806R, 5'-GGACTACHVGGGTWTCTAA T-3') targeting V4 hypervariable regions with Ion Plus Fragment Library Kit 48 rxns (Thermo sher). The amplicon libraries were conducted by incorporating adapters and samplespeci c barcodes. The PCR products were puri ed using GeneJET (Thermo Scienti c) following electrophoresis and were sequenced using Ion S5TMXL (Thermo sher).

Sequencing and Statistical Analysis of Microbiome Data
Poor quality sequences were excluded using Cutadapt V1.9.1 to obtain raw reads. Chimeric sequences in raw reads were excluded by VSearch V2.9.0 to obtain clean reads. Filtered sequences (clean reads) were clustered into operational taxonomic units (OTUs) assuming 97% identity by Uparse V7.0.1001, and subsequently assigned to taxa according to the reference sequences of SILVA132 database. Diversity analyses were performed using QIIME (V.1.9.1). Alpha diversity (diversity within sample) was determined by observed species and Shannon index. Beta diversity (diversity between samples) was assessed by unweighted (qualitative) and weighted (quantitative) UniFrac distances and was ploted by principal coordinates analysis (PCoA). The differences in diversities were assessed by Wilcoxon test, and the similarities between groups were discriminated by permutational multivariate analysis of variance (PERMANOVA). Taxonomic differences between groups were identi ed using linear discriminant analysis (LDA) effect size (LEfSe). Taxa with LDA greater than 4.0 at a P value < 0.05 were considered signi cantly enriched. Time dependent receiver operating characteristic curve (ROC) analyses were used for the evaluation of prognosis estimating models as well as the strati cation of continuous variables. Continuous variables were de ned as category variables according to the optimal cutoff values of Youden's index of three-year ROC analyses (Additional File 3: Figure S1), except for age and tumor size categorized according to the median value, 58 years or 2.2 cm. The correlation of the observed species with differentially abundant taxa, and their correlations with the clinic-pathological characteristics were analyzed by Spearman's rank test. The prognostic values for estimating overall or disease free survival were evaluated using univariate and multivariate Cox regression. Statistical analyses and the graphs including pie charts, columns, heatmaps were carried out using R statistical software. Correlation network of the taxa at genus level was performed using Graphviz software.

Immunohistochemistry Of P16
Depara nized tissue sections were incubated with monoclonal antibody against human p16 (BD Biosciences, CA, USA) at 1:50 dilution at 4 °C overnight, followed by a biotin-streptavidin horseradish peroxidase detection system (SP-9000, SPlink Detection Kit, ZSGB-bio, Beijing, China) following the manufacture's instructions. The expression of p16 was quanti ed according to staining intensity (on a scale of 0-3) and positivity (0, 5, 25, 50 or 75%) of the strongest signal of each core in tumor cells.

Klintrup-Mäkinen score
Overall in ammatory in ltration (lymphoid cells, neutrophilic and eosinophilic granulocytes) was assessed according to the Klintrup-Mäkinen (K-M) score examined in haematoxylin and eosin stained sections using a four-degree scale as previously described [31,32]. Brie y, 0, 1, 2 and 3 indicated none, poorly, moderately and highly in ammatory in ltration respectively. Results 1. Pro le and composition of the tumoral microbiota differ in poor prognosis patients and good prognosis patients The microbiota of 158 tumors from 158 HNC patients were analyzed by next generation sequencing of the 16S rRNA gene. An average of 75730 clean reads per sample and a total of 19484 OTUs in all samples were obtained. To evaluate the pro le of tumoal microbiota between distinct prognoses, we measured the observed species and Shannon index representing microbial α-diversity and unweighted and weighted UniFrac distances representing microbial β-diversity between the patients with overall survival (OS ≤ 3 yrs or > 3 yrs), or between the patients with or without early relapse (relapse ≤ 2 yrs or > 2 yrs). As shown in Fig. 1a, the observed species in patients with poor overall survival (OS ≤ 3 yrs) were lower than that in the patients with overall survival longer than three years (P = 0.041). Other indices of αdiversity were shown in Additional File 2: Table S2. Furthermore, unweighted UniFrac distances amongst the patients with early or late relapse exhibited signi cant difference (P = 0.030) (Fig. 1g) To identify the most relevant taxa responsible for the differences between distinct prognoses, LEfSe was conducted. In total, 16 taxa, including four genera, were characterized differentially abundant in patients with distinct prognoses. The phylum Firmicutes, the class Bacilli, the order Bacilllales, Lactobacillales and Bacterioidales, the family Bacillaceae and Lactobacillaceae and the genera Bacillus and Lactobacillus were enriched in patients with longer OS or late relapse. By contrast, the phylum Proteobacteria, the class Gammaproteobacteria, the order unidenti ed Gammaproteobacteria, the families Rhodocyclaceae and Burkholderiaceae and the genera Methyloversatilis and Schlegelella were enriched in patients with poor overall survival or early relapse (Fig. 1d-e and Fig. 1i-j). This is a novel nding reveals that not only the structure but also the composition of the tumoral microbiota between HNC patients with distinct prognoses are different for the rst time. However, questions, such as how the microbial structure and the composition interact, whether these microbial signatures associate with the patient outcomes, and whether they could complement the prognostic values of TNM staging or HPV status, address the following analyses.

Poor prognosis patients lost the balance of tumoral microbiota
To reveal the interaction of the pro le and the composition of the tumoral microbiota between distinct prognoses of HNCs, we reconstructed the correlation networks of the relative abundances of top 100 genera for poor prognosis (OS ≤ 3 yrs) and good prognosis (OS > 3yrs) respectively. The good prognosis network showed that Bacillus was positively correlated with most the other genera of Firmicutes, Bacteroidetes and Proteobacteria, and negatively correlated with both Methyloversatilis and Schlegelella (Fig. 2b). However, in the poor prognosis network, Bacillus lost the most correlations and only retained a positive correlation with Bacteroides, on the other hand, both Methyloversatilis and Schlegelella lost most of the negative correlations, but retained their positive interactions within Proteobacteria (Fig. 2a).
The correlation of the observed species representing the richness of the microbiota and the 16 differentially abundant taxa was further analyzed. As shown in Fig. 3a, the poor prognosis differential taxa were positively correlated with each other within phylum Proteobacteria, while the good prognosis differential taxa were positively correlated with each other within phylum Firmicutes or Bacteroidetes. The greater abundance of the poor prognosis associated taxa was correlated with a lower observed species, whereas, the greater abundance of the good prognosis associated taxa was correlated with a higher observed species (Fig. 3a). On the other hand, the greater abundance of Bacillus and/or Lactobacillus contributed 82.75% to a good richness of microbiota, whereas, the greater abundance of Methyloversatilis and/or Schlegelella contributed to 59.52% of poor richness of microbiota (Fig. 3b).
Considering the interaction of the differential taxa, we used the ratio of total abundances of Shlegelella and Methyloversatilis and total abundances of Bacillus and Lactobacillus, and de ned as microbial dysbiosis index (MDI) representing the imbalance of the microbiota between distinct prognoses [33]. The MDI demonstrated signi cant differences between the patients with OS ≤ 3 yrs and the patients with OS > 3 yrs as expected (P < 0.001, Fig. 3c). Notably, the MDI showed an inverse correlation with the observed species (r = − 0.232, P = 0.004) and a direct correlation with the weighted UniFrac Distances (r = 0.157, P = 0.001) as shown in Fig. 3d-f.
Together, the results suggested that a greater abundance of Bacillus and Lactobacillus promoted a richer microbiota. On the other hand, a greater abundance of Methyloversatilis and Schlegelella impeded the richness of the microbiota which might result in an imbalance. These also indicated that the MDI, the ratio of above taxa, could not only represent the compositional imbalance, but also re ect the structural imbalance to a certain extent.

Microbial dysbiosis index is associated with HNC prognosis independently
Next, we seek to investigate the association of the microbial dysbiosis index and the prognosis of HNC. Firstly, we evaluated discriminating value of the MDI in prognosis indication of HNCs using time dependent receiver operating characteristics (ROC) analysis. The MDI exhibited an excellent performance in estimating overall survival with an AUC of 0.708 for three-year survival estimation (Fig. 4a), and was much improved compared to the use of observed species or single taxa (Additional File 3: Figure S1). In addition, a higher cutoff value applied for the MDI presented an even better discrimination for one-year overall survival (AUC of 0.769, Fig. 4a).
Secondly, we performed Kaplan Meier analyses to investigate the association of the MDI with survival of HNCs. Median overall survival of all the 158 patients in this cohort was 61 months. The median overall survival of the patients with the high MDI (46 months with 95% CI of 33 months to not reached) was higher compared with that of the patients with the low MDI (not reached, log-rank P < 0.001, Fig. 4b).
Consistently, median disease free survival of the patients with the high MDI patients (43 months with 95% CI of 35 to 83 months) was higher compared with that of the patients with the low MDI (not reached, logrank P < 0.001, Fig. 4c).
Thirdly, we performed Cox regression analyses of the microbial and the clinic-pathological characteristics before multivariate Cox regression. The MDI and nearly all differentially abundant taxa (except for Bacillus), but not the observed species, were associated with overall survival. Lymph node metastasis, TNM staging and p16 status were associated with overall survival, age and differentiation were also associated with disease free survival in univariate Cox regression (Additional File 3: Figure S2). In multivariate Cox regression, the MDI was a prognostic indicator of both overall survival and disease free survival characteristics independent of age, TNM staging, differentiation and p16 status (HR 3.24; 95% CI, 1.88 to 5.59; P < 0.001 and HR 2.11; 95% CI, 1.34 to 3.34; P = 0.001, respectively; Fig. 4d). We further evaluated discriminating value of the risk score of above multivariate Cox model (a factor considering the effects of MDI, age, TNM staging, differentiation and p16 status) in prognosis indication of HNCs using ROC analysis. The risk score exhibited an improved performance in estimating overall survival (AUC of 0.831 for one-year survival and 0.793 for three-year survival; Fig. 4e).
In addition, we reviewed 12 articles describing differentially enriched taxa in oral microbiota and tissue microbiota between HNCs and controls (Additional File2: Table S3). Although the taxa differentially enriched in the cases and controls (or tumor tissue and adjacent/contradictory controls) were discriminating, we analyzed all above taxa succeeded to be annotated in our study and identi ed the genera Bacteroides, Enterococcus (enriched in oral cavity of the cases), Rothia and Veillonella (enriched in the tissue of the cases) to be associated with good overall survival (Additional File 3: Figure S3). These results indicated that the risking taxa of incidence might not associate with poor prognosis, or vice verse.
Collectively, these data proved that the microbial dysbiosis index exhibited an excellent performance for survival prediction compared to the use of single taxon including the differentially enriched taxa and the HNC incidence associated taxa. The MDI showed a strong and an independent association with the overall survival of HNCs comparable to the TNM staging. The risk score combining the effects of the MDI, age, TNM staging, differentiation and p16 status, is an excellent prognostic indicator of HNCs.
4. Microbial dysbiosis index complement TNM staging or HPV status in prognostic prediction of HNCs TNM staging and HPV status have been generally accepted as prognositic indicators of HNCs, therefore, we performed following analyses to investigate whether MDI could complement TNM staging and HPV status. Firstly, we compared the MDI between early stage and late stage patients (scatters in Fig. 5a), as well as between HPV negative and HPV positive patients (scatters in Fig. 5b). Although no differences were observed, higher MDI distributions were shown in early stage patients and HPV positive patients. Cochrane-Mantel-Haenszel tests demonstrated that the high MDI was associated with poor overall survival (OS ≤ 3 yrs) in both early TNM patients and late TNM patients or both p16 negative patients and p16 positive patients (pies in Fig. 5a and 5b).
Secondly, we conducted strati ed Kaplan-Meier analyses and subgroup Cox regressions. The late stage patients with the high MDI exhibited signi cant poor overall survival (median of 30.5 months, log-rank P < 0.001, Fig. 5c). Interestingly, the survivals of the high MDI patients and the low MDI patients were signi cantly different in the late stage patients (P < 0.001), but the difference in the early stage patients was not signi cant (P = 0.066). On the other hand, the survivals of the early stage patients and the late stage patients were signi cantly different in the patients with high MDI (P = 0.028), but the difference in the patients with low MDI was not signi cant (P = 0.107). In terms of p16 status strati cation, the survivals of high MDI patients and low MDI patients were signi cantly different in both the p16 positive patients (P = 0.010) and the p16 negative patients (P < 0.001). However, no signi cant differences were observed between the p16 positive patients and the p16 negative patients in the patients with high MDI (P = 0.100) or in the patients with low MDI (P = 0.053). Consistently, in the subgroup Cox regression analysis, the MDI showed higher risk in the late stage patients (HR 3.35; 95% CI, 1.73 to 6.49; P < 0.001) compared with the early stage patients (HR 2.41; 95% CI, 1.00 to 5.84; P = 0.051) (Fig. 5e). And the MDI showed both high risks in the p16 positive patients (HR 4.77; 95% CI, 1.43 to 15.88; P = 0.011) and the p16 negative patients (HR 3.37; 95% CI, 1.88 to 6.03; P < 0.001). These results indicated that the MDI show more distinctive efforts in the late stage patients. Furthermore, these results also suggested that the MDI might showed more association with the overall survival compared to p16 status.
Thirdly, we evaluated discriminating value of the MDI in three-year overall survival prediction of HNCs using ROC analysis ( Fig. 5f-5 g). The MDI showed signi cant higher AUC compared to p16 status (AUC of 0.708 and 0.581 respectively, P = 0.050), but not compared to TNM staging (AUC of 0.708 and 0.644 respectively, P = 0.362). Consistently, p16 status together with the MDI exhibited an improved performance compared to p16 status only (AUC of 0.719 and 0.581 respectively, P = 0.024). However, the improvement of TNM staging together with the MDI was not signi cant compared to TNM staging only (AUC of 0.730 and 0.644 respectively, P = 0.171). TNM staging, p16 status together with the MDI exhibited an excellent performance with an AUC of 0.736 and a sensitivity of 0.971. These results suggested that the MDI could complement the prognostic indication of HNCs for p16 status to a considerable extent, but a limited extent for TNM staging.

Tumor behavior and in ammatory in ltration modulation might contribute to the association of microbiota and the prognosis of HNCs
To seek to understand how the microbiota in uences the outcomes of HNCs, we analyze the correlation of the microbial signature and the clinic-pathological characteristics. We found that the greater abundances of Gammaproteobacteria and Shlegelella were correlated with elder age, and a greater abundance of Proteobacteria was correlated with lower differentiation, whereas, Lactobacillales was correlated with smaller tumor size (Fig. 6a). Furthermore, signi cant microbial differences were observed in the patients with differing age, lymph node metastasis or differentiation, but not in the patients with differing gender, smoking or drinking status, tumor site, histological type or tumor size (Additional File 3: Figure S4-5). Essentially, the differential taxa enriched in poor prognosis, elder age, positive lymph nodes or late stage belong to the phylum Proteobacteria, by contrast, the taxa differentially enriched in good prognosis, negative lymph nodes or early stage belong to the phylum Firmicutes (Additional File 3: Figure  S6). These ndings suggested the microbiota might in uence the patient outcomes through affecting the tumor behaviors, such as lymph node metastasis, potentially.
To investigate whether the bacteria affect the outcomes of HNCs through tumor microenvironment modulation, we analyze the correlation of the microbial signature with the overall in ammatory in ltration indicated by Klintrup-Mäkinen (K-M) score [31,32]. Greater abundances of Methyloversatilis and/or Schlegelella, and less abundances Bacillus and/or Lactobacillus were distributed in none in ammatory in ltrated patients (Fig. 6b). No differences were found between the K-M score and the abundances of above differentially abundant taxa (P = 0.109 for Lactobacillus, Fig. 6c-f). However, higher Lactobacillus was observed in highly in ammatory in ltration compared to none in ammatory in ltration (P = 0.042, Fig. 6d). As illustrated in the Alluvial plots in Fig. 6g-6j, the in ammatory in ltration did not affect the association of the differentially abundant taxa with the overall survival, except for Lactobacillus. These results suggested that Lactobacillus promoted the in ammatory in ltration to a certain extent, which might contribute to its association with the outcomes of HNCs. However, the other differentially abundant taxa, especially the risk taxa, Methyloversatilis and Schlegelella, might not affect the outcomes of HNCs through the modulation of in ammatory in ltration.

Discussion
In this study, by pro ling 16S rRNA genes using nextgeneration sequencing, we explored the tumoral microbiota associated with poor prognosis and good prognosis in large-scale and in-depth analyses. For the rst time, we revealed the structural differences of tumoral microbiota in poor prognosis patients and good prognosis patients, with a reduced richness in poor prognosis patients. This is consistent with the ndings of the gut microbiota in the recipients of allogeneic hematopoietic stem cell transplantation [6]. In composition, we identi ed nine good prognosis associated taxa and seven poor prognosis associated taxa. Gammaproteobacteria, which is negatively associated with the prognosis of pancreatic ductal adenocarcinoma [7], was also found associated with the prognosis of HNCs in our study. However, Fusobacterium, reported to be associated with periodontitis and pancreatic cancer [10,34], showed no differential enrichment in HNC patients with distinct overall survival or early relapse.
The poor prognosis associated taxa, Schlegelella, has recently been identi ed in dentin caries lesions due to the advances of the next generation sequencing [35,36]. This suggested a potential connection of the microbial translocation post mucosal integrity loss and the HNCs related with residual root or crown resulting from caries. Intriguingly, Methyloversatilis broadly distributes in the natural environment with the capabilities of methylotroph of methanol or methylamine as well as utilization of other organic acids such as alcohols and aromatic compounds [37,38]. We have known that the Warburg effect of cancer cells needs large amount of carbon sources. Therefore, further studies investigating the involvement of the metabolites from Methyloversatilis in the growth and the metabolism of cancer cells may explain its association with prognosis and would be a stepping-stone for the understanding of HNCs. In accordance with previous studies of vaginal microbiota, Lactobacillus, asscociated with the early regressions of cervical intraepithelial neoplasia [39], was found associated with good prognosis of HNCs in our study. Together with the essential role of Lactobacillus in the balance of microbiota, the ndings suggested that Lactobacillus depletion and microbial dysbiosis by antibiotic abuse in clinic might associate with poor prognosis.
We integrated the above genera most relevant to the prognosis of HNCs to calculate the microbial dysbiosis index. It not only re ected the compositional imbalance of the microbiota between poor prognosis and good prognosis, but also showed high correlations with the diversities of the microbiota. The MDI exhibited an improved capacity to discriminate poor prognosis compared to the use of single differentially abundant taxa. This suggests that the microbiota as a whole rather than individual taxa, including differentially enriched taxa as well as the HNC incidence associated taxa, played more important role in the prognosis of HNCs. Notably, risk factors for cancer initiation may not associated with cancer development and outcome, such as Helicobacter Pylori, the risk factor for gastric cancers, has been proven not to affect the prognosis or a bene cial prognostic indicator in different studies [40][41][42]. Our results suggested the HNC incidence risk taxa might not associate with poor prognoses, or vice verse. Therefore, the need to evaluate microbiological prognostic indicators, other than incidence risk indicators, is much essential. In our study, the risk scores combining the MDI, age, TNM staging, differentiation and p16 status demonstrated an excellent performance with an AUC nearly 80%.
Classical TNM staging is still the most important strategy for prognostic evaluation of HNCs in clinic. HPV positive patients have also been generally accepted to have a good prognosis. In our study, we found the MDI showed a comparable hazard ratio with the TNM staging. The late stage patients with the high MDI exhibited extremely reduced overall survival, by contrast, the late stage patients with the low MDI exhibited comparable overall survivals with the early stage patients. These ndings not only supported the complementary value of the MDI for prognostic indication, but also suggested that microbial modulation might improve the overall survival of the late stage HNC patients potentially. In terms of HPV status, our results demonstrated a much stronger association of the MDI with the prognosis of HNCs compared to p16 status. The MDI could signi cantly improve the performance of p16 status in three-year survival indication of HNCs. These implicate more attention to be given to HPV positive HNCs with a high MDI microbiota for therapy or follow up strategies. On the other hand, vaginal microbiota has been proven to in uence HPV acquisition and persistence and the development of cervical cancer [43]. Together with the microbioal discriminations between HPV positive and HPV negative patients of HNCs in our study and other reports [14], these suggested that the distinct prognoses between HPV positive and negative HNCs may possibly result from the contribution of microbiota and provide directions for further studies.
Next, we found correlations of the differentially abundant taxa with the clinic-pathological characteristics.
On the other hand, we found overlaps of the differentially abundant taxa in distinct prognosis and the clinic-pathological characteristics. These ndings suggested that the microbiota might affect the clinicpathological characteristics, such as tumor behaviors, leading to distinction of prognoses. We also took the tumor microenvironment into consideration. However, no correlations of the microbial signatures with the overall in ammatory in ltration based on Klintrup-Mäkinen (K-M) score were found, except for Lactobacillus. These results suggested that the in ammatory in ltration modulated by Lactobacillus might be involved in the progression of HNCs to a certain extent.

Conclusion
In this study, we have demonstrated that the structure and the composition of the tumor microbiota in the HNC patients with poor prognosis are signi cantly different from that of the patients with good prognosis. Microbial dysbiosis index, the ratio representing the compositional imbalance of the whole microbiota, has shown strong association with overall and disease free survival independently of age, TNM staging, differentiation and HPV status. It could signi cantly improve the predictive performance of HPV status, indicating the important role of tumoral microbiota in the distinct prognoses between HPV positive patients and HPV negative patients. Unexpectly, we found that the MDI severely affected the overall survival of the late stage patients. This suggests that tumoral microbiota might not only be used as a prognostic indicator, but also its modulation might be valuable in survival improvement of the late stage patients.

Declarations
Ethics approval and consent to participate The study was approved from Ethics Committee of Qilu Hospital of Shandong University.

Consent for publication
Not applicable.

Availability of data and materials
The sequencing data that support the results of this study have been deposited in "NCBI" with the BioProject accession code "PRJNA624194" (https://www.ncbi.nlm.nih.gov/sra/PRJNA624194).

Competing Interests
The authors declare no competing interests.

Figure 2
Correlation networks of the genera in patients with distinct overall survival a Network in the patients with OS ≤ 3Y. b Network in the patients with OS > 3Y. Each node represents a genus, and the size is proportional to the mean relative abundance. The edges represent the correlation analyzed using Spearman's rank test. Red edges represent positive correlation, and blue edges represent negative correlation. Only the nodes and the edges with correlation coe ence > 0.6, P < 0.05 and relative abundance > 0.005% are shown.

Figure 3
Correlation of differentially abundant taxa, microbial dysbiosis index and microbial diversities a A heatmap illustrates the correlation of observed species and the relative abundances of the differentially  Univariate Cox regression of observed species, differentially abundant taxa and risk taxa ratio, as well as clinic-pathological characteristics are shown in Additional File 3: Figure S2. e Time dependent ROC analysis evaluates the discriminatory potential of risk score from multivariate Cox regression of overall survival in prognostic indication of head and neck cancer.  indicates P < 0.01, *** indicates P <0.001e A forest plot shows the hazard ratios of MDI in TNM stage and p16 status subgroups for overall survivals. f-g ROC analysis evaluates the discriminatory potential of the MDI, TNM stage, p16 status, MDI together with TNM stage, MDI together with p16 status, as well as MDI together with TNM stage and p16status in three-year overall survival prediction of head and neck cancer. P16 negative is considered as "1" in this ROC analysis. Supplementary Files