The microbial composition of feces, tissue and sputum differs markedly between the LC and LN groups.
The fecal, tissue and sputum flora of LC patients differ significantly from those of healthy people. Principal coordinates analysis (PCoA) based on the Bray-Curtis distance showed that the LC group and the LN group can be clearly distinguished in feces, tissue and sputum alike (Fig1. A). This suggests that the relative abundance of bacteria in sputum, lung tissue and feces differs significantly between the LC group and the control group. Among the bacteria shared by the three sample types, relative abundances differ between feces, sputum and tissue (Fig1. B). The boxplot indicates that the overall trends of the fecal and sputum microbiota resemble each other more closely, and the evolutionary tree shows that the bacteria enriched in feces, tissue and sputum are evolutionarily similar (Fig S1). The tissue microbiota differs markedly from those of feces and sputum (Fig1. B), suggesting that most bacteria do not migrate directly.
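As an illustration of the distance underlying the PCoA above, the following minimal sketch computes Bray-Curtis dissimilarities between relative-abundance profiles. The sample names and abundance values are hypothetical and are not taken from this study; the ordination step itself (PCoA on the resulting matrix) is not shown.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    # Two empty profiles are treated as identical (assumption for this sketch).
    return numerator / denominator if denominator else 0.0


def distance_matrix(samples):
    """Return sorted sample names and the pairwise Bray-Curtis matrix."""
    names = sorted(samples)
    matrix = [[bray_curtis(samples[a], samples[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical relative abundances of three taxa in three samples.
toy_samples = {
    "LC_1": [0.6, 0.3, 0.1],
    "LC_2": [0.5, 0.4, 0.1],
    "LN_1": [0.1, 0.2, 0.7],
}
names, dist = distance_matrix(toy_samples)
```

In this toy example the two LC profiles lie closer to each other than either does to the LN profile, which is the kind of separation a PCoA plot makes visible.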
Compared to the sputum microbiome, the lung tissue microbiome is more consistent with the fecal microbiome.
Cluster analysis was performed on the three data sets based on the weighted UniFrac distance (Emperor) (Fig2. A). The results showed that the fecal and sputum microbiomes lie relatively far apart in composition. Permutational multivariate analysis of variance (PERMANOVA) was used to compare the distances between the three groups of samples (Fig2. B, C and D), which makes the distances between sputum, tissue and feces apparent. The sample type closest to tissue is feces (Fig2. D), and the one closest to feces is sputum (Fig2. C and E). The fecal microbiome is the closest to both sputum and tissue (Fig2. C and D). These results point to the importance of the fecal microbiome for tissue.
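The PERMANOVA comparison described above can be sketched as follows. This is a minimal illustration of the pseudo-F statistic and its permutation p-value on a precomputed distance matrix, not the pipeline used in this study; the one-dimensional toy distances, group labels, permutation count and seed are all assumptions made for the example.

```python
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        pairs = sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical samples placed on a line: two well-separated groups.
points = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
toy_labels = ["feces"] * 5 + ["sputum"] * 5
toy_dist = [[abs(p - q) for q in points] for p in points]
f_obs, p_value = permanova(toy_dist, toy_labels)
```

A large pseudo-F with a small permutation p-value indicates that between-group distances exceed within-group distances more than label shuffling would produce by chance.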
The relative abundance of bacteria can distinguish most LC samples from LN samples.
To evaluate how well each of these body-site data sets distinguishes LC, and to explore what causes the differences in classification ability, we used a random forest model to determine which site's microbiome best separates LC from LN patients (Fig3. A).
We used the random forest model to characterize the microorganisms in the three environments. We took an AUC greater than 0.5 to indicate that the LC and LN samples differ significantly. The AUC for all 339 specimens is 0.733, the AUC for fecal samples is 0.674, and the AUC for tissue samples is 0.703 (Fig3. A). The best classification was obtained for sputum, with an AUC of 0.900, which is consistent with the previous study (Fig3. A). Most of the top 15 genera that affect the classification are unidentified bacteria (Fig S2).
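To make the AUC values above concrete, the sketch below computes the area under the ROC curve directly from its rank interpretation: the fraction of (LC, LN) sample pairs in which the LC sample receives the higher classifier score, with ties counted as one half. The labels and scores are hypothetical stand-ins for random forest outputs and are not data from this study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outscores a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted LC probabilities (1 = LC, 0 = LN).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under this definition an AUC of 0.5 corresponds to a classifier that ranks LC and LN samples no better than chance, and 1.0 to perfect ranking.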
To find out why the RF model classified sputum best, we analyzed the α diversity of the microbiome in the three environments (Fig3. B). The diversity analysis showed that sputum had lower diversity than tissue and fecal samples (Fig3. B). Because of the lower microbial α diversity, there are fewer significant bacteria related to the occurrence and development of LC. We then used PICRUSt2 to predict the relative abundance of enzyme genes in these samples, and counted, for each EC (Enzyme Commission classification) type, the enzymes that differed significantly (|log2 FC| > 1) between LCs and control LNs (Fig3. C and D). We compared the ECs of the three groups using a two-tailed t-test and found no significant difference in the compositional classification of EC and KO among the three environments (Fig3. C and D). Among them, Veillonella in feces, Streptococcus in sputum, and Streptococcus and Veillonella in tissue were also found in the lower airways of patients with lung cancer. Enrichment of these oral taxa was associated with up-regulation of the ERK and PI3K signaling pathways.
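The α-diversity comparison described above can be illustrated with a short sketch. The Shannon index is used here as an assumption, since the text does not state which α-diversity measure was applied, and the count vectors are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical taxon counts: an even community and a dominated one.
even_community = [25, 25, 25, 25]
dominated_community = [85, 5, 5, 5]
h_even = shannon_index(even_community)
h_dominated = shannon_index(dominated_community)
```

A community dominated by a few taxa yields a lower index than an even one with the same number of taxa, which is the sense in which one sample type can be less diverse than another.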
Significantly different KOs, ECs and pathways are more similar between feces and tissue.
To explore the relationship between bacterial differences and functional differences between the LC and LN groups, we used PICRUSt2 to predict the functional composition of the microbial community and mapped the significant differences (|log2FC| >= 1) between the LC and LN groups. The Venn diagram shows the counts and intersections of significantly different ECs and KOs (KEGG Orthology) among tissue, feces and sputum (Figure4. A and B). Genes and enzymes changed significantly in all three environments, and feces and tissue showed the higher Jaccard similarity in both enzymes and genes (Figure4. A). We then plotted a heatmap of the relative abundances of the genes specific to each pair of sites and found that most of the differential enzymes common to feces and sputum do not exist in tissue (21/40; 52.5%), and the rest show no change in relative abundance (FigureS3. A and B). Most of the differential enzyme genes common to feces and tissue are also present in sputum (43/64; 67.2%); the others show no significant change in sputum (Figure4. C). These relative abundance changes reveal that the functional differences of the tissue and fecal microbiomes are closer to each other than those of feces and sputum. The main differential enzymes common to feces and tissue belong to EC1, whereas the main differential enzyme class between feces and sputum, as well as between tissue and sputum, is EC2 (Figure4. C and Figure S3).
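The selection and overlap steps described above can be sketched as follows: features are kept when |log2FC| >= 1 between the LC and LN group means, and the resulting sets from two sites are compared by Jaccard similarity. The feature names, mean abundances and pseudocount are hypothetical and serve only to illustrate the calculation.

```python
import math


def differential_features(mean_lc, mean_ln, threshold=1.0, pseudocount=1e-6):
    """Features whose |log2 fold change| between LC and LN means meets the threshold."""
    selected = set()
    for feature in mean_lc.keys() & mean_ln.keys():
        # The pseudocount (an assumption of this sketch) avoids division by zero.
        ratio = (mean_lc[feature] + pseudocount) / (mean_ln[feature] + pseudocount)
        if abs(math.log2(ratio)) >= threshold:
            selected.add(feature)
    return selected


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical mean relative abundances of four enzymes per site and group.
feces_lc = {"e1": 0.40, "e2": 0.10, "e3": 0.05, "e4": 0.20}
feces_ln = {"e1": 0.10, "e2": 0.10, "e3": 0.20, "e4": 0.21}
tissue_lc = {"e1": 0.30, "e2": 0.40, "e3": 0.02, "e4": 0.10}
tissue_ln = {"e1": 0.05, "e2": 0.10, "e3": 0.10, "e4": 0.10}

feces_diff = differential_features(feces_lc, feces_ln)
tissue_diff = differential_features(tissue_lc, tissue_ln)
similarity = jaccard(feces_diff, tissue_diff)
```

A Jaccard value closer to 1 means that two sites share a larger fraction of their differential enzymes or genes.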
The metabolic pattern of the sputum microbiota in LC patients differs significantly from those of feces and tissue.
We used the LDA score (P<0.05) from the LEfSe software to screen for genera that changed significantly in the three environments (Fig. S4). By comparison against the KEGG and MetaCyc databases, we found that these marker genera and the significantly different pathways are related differently in different environments. The heat map shows that the sputum marker genera differ significantly in metabolic pathways (Fig. 5A and S5). The bacteria enriched in the feces of healthy people are positively correlated with the D-galactarate degradation I, D-glucarate degradation I and superpathway of D-glucarate and D-galactarate degradation pathways (Fig5. B, D, E and FigS5), whereas those enriched in the sputum of healthy people are negatively correlated with them (Fig5. B, D, E and FigS5). The bacteria enriched in the sputum and feces of LCs are positively correlated with the lactose and galactose degradation I pathway, whereas those enriched in the tissue of LCs are negatively correlated with it (Figure5 C). This indicates that, in intestinal microbes, the relative abundance of glucose-utilizing metabolic pathways is decreased in LC patients, whereas the glucose metabolism pathways in sputum are increased (Fig5. B, D, E), and the degradation activities in LCs are also increased (Figure S5). This might explain why the sputum of LC patients has a higher microbial diversity, owing to the increase in glycan and degradation activity in the microbes of LC sputum. The reduction of biosynthetic activities and the increase of organic substances have made the organic substances in the sputum environment more complex. Through enrichment analysis of the differential pathways and differential microbes in the three environments, we found that the microbes involved in monosaccharide metabolism decreased in the patients' feces and increased in the sputum (Figure S5). The pathways related to glucose metabolism have the most extensive correlation with the marker genera for LC (Figure S6).
These results indicate that glycolytic metabolic features differ significantly between LCs and LNs.
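The positive and negative correlations between marker genera and pathways reported in this section can be illustrated with a rank correlation. Spearman's coefficient is an assumption here, since the text does not state which correlation measure was used, and the abundance values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-sample abundances of one genus and one pathway.
genus_abundance = [0.02, 0.05, 0.11, 0.20, 0.31]
pathway_abundance = [0.10, 0.14, 0.13, 0.25, 0.40]
rho = spearman(genus_abundance, pathway_abundance)
```

A coefficient near +1 or -1 corresponds to the positively or negatively correlated genus and pathway pairs shown in the heat maps, and a value near 0 to no monotonic relationship.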