1. Profile and composition of the tumoral microbiota differ in poor prognosis patients and good prognosis patients
The microbiota of 158 tumors from 158 HNC patients were analyzed by next generation sequencing of the 16S rRNA gene. An average of 75730 clean reads per sample and a total of 19484 OTUs in all samples were obtained. To evaluate the profile of tumoal microbiota between distinct prognoses, we measured the observed species and Shannon index representing microbial α-diversity and unweighted and weighted UniFrac distances representing microbial β-diversity between the patients with overall survival (OS ≤ 3 yrs or > 3 yrs), or between the patients with or without early relapse (relapse ≤ 2 yrs or > 2 yrs). As shown in Fig. 1a, the observed species in patients with poor overall survival (OS ≤ 3 yrs) were lower than that in the patients with overall survival longer than three years (P = 0.041). Other indices of α-diversity were shown in Additional File 2: Table S2. Furthermore, unweighted UniFrac distances amongst the patients with early or late relapse exhibited significant difference (P = 0.030) (Fig. 1g). PERMANOVA indicated significant differences of compositions of the tumoral microbiota based on both unweighted and weighted UniFrac distances (Fig. 1b-c and Fig. 1g-h).
To identify the most relevant taxa responsible for the differences between distinct prognoses, LEfSe was conducted. In total, 16 taxa, including four genera, were characterized differentially abundant in patients with distinct prognoses. The phylum Firmicutes, the class Bacilli, the order Bacilllales, Lactobacillales and Bacterioidales, the family Bacillaceae and Lactobacillaceae and the genera Bacillus and Lactobacillus were enriched in patients with longer OS or late relapse. By contrast, the phylum Proteobacteria, the class Gammaproteobacteria, the order unidentified Gammaproteobacteria, the families Rhodocyclaceae and Burkholderiaceae and the genera Methyloversatilis and Schlegelella were enriched in patients with poor overall survival or early relapse (Fig. 1d-e and Fig. 1i-j).
This is a novel finding reveals that not only the structure but also the composition of the tumoral microbiota between HNC patients with distinct prognoses are different for the first time. However, questions, such as how the microbial structure and the composition interact, whether these microbial signatures associate with the patient outcomes, and whether they could complement the prognostic values of TNM staging or HPV status, address the following analyses.
2. Poor prognosis patients lost the balance of tumoral microbiota
To reveal the interaction of the profile and the composition of the tumoral microbiota between distinct prognoses of HNCs, we reconstructed the correlation networks of the relative abundances of top 100 genera for poor prognosis (OS ≤ 3 yrs) and good prognosis (OS > 3yrs) respectively. The good prognosis network showed that Bacillus was positively correlated with most the other genera of Firmicutes, Bacteroidetes and Proteobacteria, and negatively correlated with both Methyloversatilis and Schlegelella (Fig. 2b). However, in the poor prognosis network, Bacillus lost the most correlations and only retained a positive correlation with Bacteroides, on the other hand, both Methyloversatilis and Schlegelella lost most of the negative correlations, but retained their positive interactions within Proteobacteria (Fig. 2a).
The correlation of the observed species representing the richness of the microbiota and the 16 differentially abundant taxa was further analyzed. As shown in Fig. 3a, the poor prognosis differential taxa were positively correlated with each other within phylum Proteobacteria, while the good prognosis differential taxa were positively correlated with each other within phylum Firmicutes or Bacteroidetes. The greater abundance of the poor prognosis associated taxa was correlated with a lower observed species, whereas, the greater abundance of the good prognosis associated taxa was correlated with a higher observed species (Fig. 3a). On the other hand, the greater abundance of Bacillus and/or Lactobacillus contributed 82.75% to a good richness of microbiota, whereas, the greater abundance of Methyloversatilis and/or Schlegelella contributed to 59.52% of poor richness of microbiota (Fig. 3b).
Considering the interaction of the differential taxa, we used the ratio of total abundances of Shlegelella and Methyloversatilis and total abundances of Bacillus and Lactobacillus, and defined as microbial dysbiosis index (MDI) representing the imbalance of the microbiota between distinct prognoses [33]. The MDI demonstrated significant differences between the patients with OS ≤ 3 yrs and the patients with OS > 3 yrs as expected (P < 0.001, Fig. 3c). Notably, the MDI showed an inverse correlation with the observed species (r = − 0.232, P = 0.004) and a direct correlation with the weighted UniFrac Distances (r = 0.157, P = 0.001) as shown in Fig. 3d-f.
Together, the results suggested that a greater abundance of Bacillus and Lactobacillus promoted a richer microbiota. On the other hand, a greater abundance of Methyloversatilis and Schlegelella impeded the richness of the microbiota which might result in an imbalance. These also indicated that the MDI, the ratio of above taxa, could not only represent the compositional imbalance, but also reflect the structural imbalance to a certain extent.
3. Microbial dysbiosis index is associated with HNC prognosis independently
Next, we seek to investigate the association of the microbial dysbiosis index and the prognosis of HNC. Firstly, we evaluated discriminating value of the MDI in prognosis indication of HNCs using time dependent receiver operating characteristics (ROC) analysis. The MDI exhibited an excellent performance in estimating overall survival with an AUC of 0.708 for three-year survival estimation (Fig. 4a), and was much improved compared to the use of observed species or single taxa (Additional File 3: Figure S1). In addition, a higher cutoff value applied for the MDI presented an even better discrimination for one-year overall survival (AUC of 0.769, Fig. 4a).
Secondly, we performed Kaplan Meier analyses to investigate the association of the MDI with survival of HNCs. Median overall survival of all the 158 patients in this cohort was 61 months. The median overall survival of the patients with the high MDI (46 months with 95% CI of 33 months to not reached) was higher compared with that of the patients with the low MDI (not reached, log-rank P < 0.001, Fig. 4b). Consistently, median disease free survival of the patients with the high MDI patients (43 months with 95% CI of 35 to 83 months) was higher compared with that of the patients with the low MDI (not reached, log-rank P < 0.001, Fig. 4c).
Thirdly, we performed Cox regression analyses of the microbial and the clinic-pathological characteristics before multivariate Cox regression. The MDI and nearly all differentially abundant taxa (except for Bacillus), but not the observed species, were associated with overall survival. Lymph node metastasis, TNM staging and p16 status were associated with overall survival, age and differentiation were also associated with disease free survival in univariate Cox regression (Additional File 3: Figure S2). In multivariate Cox regression, the MDI was a prognostic indicator of both overall survival and disease free survival characteristics independent of age, TNM staging, differentiation and p16 status (HR 3.24; 95% CI, 1.88 to 5.59; P < 0.001 and HR 2.11; 95% CI, 1.34 to 3.34; P = 0.001, respectively; Fig. 4d). We further evaluated discriminating value of the risk score of above multivariate Cox model (a factor considering the effects of MDI, age, TNM staging, differentiation and p16 status) in prognosis indication of HNCs using ROC analysis. The risk score exhibited an improved performance in estimating overall survival (AUC of 0.831 for one-year survival and 0.793 for three-year survival; Fig. 4e).
In addition, we reviewed 12 articles describing differentially enriched taxa in oral microbiota and tissue microbiota between HNCs and controls (Additional File2: Table S3). Although the taxa differentially enriched in the cases and controls (or tumor tissue and adjacent/contradictory controls) were discriminating, we analyzed all above taxa succeeded to be annotated in our study and identified the genera Bacteroides, Enterococcus (enriched in oral cavity of the cases), Rothia and Veillonella (enriched in the tissue of the cases) to be associated with good overall survival (Additional File 3: Figure S3). These results indicated that the risking taxa of incidence might not associate with poor prognosis, or vice verse.
Collectively, these data proved that the microbial dysbiosis index exhibited an excellent performance for survival prediction compared to the use of single taxon including the differentially enriched taxa and the HNC incidence associated taxa. The MDI showed a strong and an independent association with the overall survival of HNCs comparable to the TNM staging. The risk score combining the effects of the MDI, age, TNM staging, differentiation and p16 status, is an excellent prognostic indicator of HNCs.
4. Microbial dysbiosis index complement TNM staging or HPV status in prognostic prediction of HNCs
TNM staging and HPV status have been generally accepted as prognositic indicators of HNCs, therefore, we performed following analyses to investigate whether MDI could complement TNM staging and HPV status. Firstly, we compared the MDI between early stage and late stage patients (scatters in Fig. 5a), as well as between HPV negative and HPV positive patients (scatters in Fig. 5b). Although no differences were observed, higher MDI distributions were shown in early stage patients and HPV positive patients. Cochrane-Mantel-Haenszel tests demonstrated that the high MDI was associated with poor overall survival (OS ≤ 3 yrs) in both early TNM patients and late TNM patients or both p16 negative patients and p16 positive patients (pies in Fig. 5a and 5b).
Secondly, we conducted stratified Kaplan-Meier analyses and subgroup Cox regressions. The late stage patients with the high MDI exhibited significant poor overall survival (median of 30.5 months, log-rank P < 0.001, Fig. 5c). Interestingly, the survivals of the high MDI patients and the low MDI patients were significantly different in the late stage patients (P < 0.001), but the difference in the early stage patients was not significant (P = 0.066). On the other hand, the survivals of the early stage patients and the late stage patients were significantly different in the patients with high MDI (P = 0.028), but the difference in the patients with low MDI was not significant (P = 0.107). In terms of p16 status stratification, the survivals of high MDI patients and low MDI patients were significantly different in both the p16 positive patients (P = 0.010) and the p16 negative patients (P < 0.001). However, no significant differences were observed between the p16 positive patients and the p16 negative patients in the patients with high MDI (P = 0.100) or in the patients with low MDI (P = 0.053). Consistently, in the subgroup Cox regression analysis, the MDI showed higher risk in the late stage patients (HR 3.35; 95% CI, 1.73 to 6.49; P < 0.001) compared with the early stage patients (HR 2.41; 95% CI, 1.00 to 5.84; P = 0.051) (Fig. 5e). And the MDI showed both high risks in the p16 positive patients (HR 4.77; 95% CI, 1.43 to 15.88; P = 0.011) and the p16 negative patients (HR 3.37; 95% CI, 1.88 to 6.03; P < 0.001). These results indicated that the MDI show more distinctive efforts in the late stage patients. Furthermore, these results also suggested that the MDI might showed more association with the overall survival compared to p16 status.
Thirdly, we evaluated discriminating value of the MDI in three-year overall survival prediction of HNCs using ROC analysis (Fig. 5f-5 g). The MDI showed significant higher AUC compared to p16 status (AUC of 0.708 and 0.581 respectively, P = 0.050), but not compared to TNM staging (AUC of 0.708 and 0.644 respectively, P = 0.362). Consistently, p16 status together with the MDI exhibited an improved performance compared to p16 status only (AUC of 0.719 and 0.581 respectively, P = 0.024). However, the improvement of TNM staging together with the MDI was not significant compared to TNM staging only (AUC of 0.730 and 0.644 respectively, P = 0.171). TNM staging, p16 status together with the MDI exhibited an excellent performance with an AUC of 0.736 and a sensitivity of 0.971. These results suggested that the MDI could complement the prognostic indication of HNCs for p16 status to a considerable extent, but a limited extent for TNM staging.
5. Tumor behavior and inflammatory infiltration modulation might contribute to the association of microbiota and the prognosis of HNCs
To seek to understand how the microbiota influences the outcomes of HNCs, we analyze the correlation of the microbial signature and the clinic-pathological characteristics. We found that the greater abundances of Gammaproteobacteria and Shlegelella were correlated with elder age, and a greater abundance of Proteobacteria was correlated with lower differentiation, whereas, Lactobacillales was correlated with smaller tumor size (Fig. 6a). Furthermore, significant microbial differences were observed in the patients with differing age, lymph node metastasis or differentiation, but not in the patients with differing gender, smoking or drinking status, tumor site, histological type or tumor size (Additional File 3: Figure S4-5). Essentially, the differential taxa enriched in poor prognosis, elder age, positive lymph nodes or late stage belong to the phylum Proteobacteria, by contrast, the taxa differentially enriched in good prognosis, negative lymph nodes or early stage belong to the phylum Firmicutes (Additional File 3: Figure S6). These findings suggested the microbiota might influence the patient outcomes through affecting the tumor behaviors, such as lymph node metastasis, potentially.
To investigate whether the bacteria affect the outcomes of HNCs through tumor microenvironment modulation, we analyze the correlation of the microbial signature with the overall inflammatory infiltration indicated by Klintrup-Mäkinen (K-M) score [31, 32]. Greater abundances of Methyloversatilis and/or Schlegelella, and less abundances Bacillus and/or Lactobacillus were distributed in none inflammatory infiltrated patients (Fig. 6b). No differences were found between the K-M score and the abundances of above differentially abundant taxa (P = 0.109 for Lactobacillus, Fig. 6c-f). However, higher Lactobacillus was observed in highly inflammatory infiltration compared to none inflammatory infiltration (P = 0.042, Fig. 6d). As illustrated in the Alluvial plots in Fig. 6g-6j, the inflammatory infiltration did not affect the association of the differentially abundant taxa with the overall survival, except for Lactobacillus. These results suggested that Lactobacillus promoted the inflammatory infiltration to a certain extent, which might contribute to its association with the outcomes of HNCs. However, the other differentially abundant taxa, especially the risk taxa, Methyloversatilis and Schlegelella, might not affect the outcomes of HNCs through the modulation of inflammatory infiltration.