Patient cohort and sequencing quality
The cohort comprised 8 samples of AD, 10 samples of SCC, 5 samples of small cell lung cancer (SCLC), and 3 samples of combined small cell lung cancer (C-SCLC). Among all the patients, 19 had a smoking history and 6 had a chemotherapy history. Patients’ basic information is presented in Table 1. 16S rRNA sequencing yielded a total of 2,388,223 raw sequences from the 26 samples, of which 1,764,852 high-quality sequences remained after splicing and filtering. The average sequence length per sample was 415–427 bp, and bases with a quality score of at least 30 (Q30) accounted for 94.61–95.53 % of all bases. ITS sequencing yielded a total of 2,058,555 raw sequences from the 26 samples, of which 1,747,974 high-quality sequences remained after splicing and filtering. The average sequence length per sample was 215–263 bp, and bases with a quality score of at least 30 accounted for 94.31–98.34 % of all bases.
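As a rough illustration of the two quality figures reported above, the following sketch computes the share of bases with a quality score of at least 30 and the share of raw sequences retained after filtering. The function name `q30_fraction` and the Phred+33 encoding are assumptions made for this example; the actual quality-control pipeline is not described in this section.

```python
def q30_fraction(quality_strings, offset=33):
    """Fraction of bases whose Phred score is at least 30.

    Assumes FASTQ quality strings in Phred+33 encoding (an assumption,
    not something stated in the text).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= 30:
                passing += 1
    return passing / total if total else 0.0

# Retained fraction of sequences, using the read counts reported in the text.
retained_16s = 1_764_852 / 2_388_223
retained_its = 1_747_974 / 2_058_555
print(f"16S retained: {retained_16s:.1%}, ITS retained: {retained_its:.1%}")
```

With the reported counts, roughly 74 % of the 16S sequences and 85 % of the ITS sequences pass filtering.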
Table 1
Clinical and pathologic characteristics of enrolled patients
Feature | AD | SCC | SCLC | C-SCLC |
Number | 8 | 10 | 5 | 3 |
Gender | | | | |
Male | 5 | 10 | 5 | 3 |
Female | 3 | 0 | 0 | 0 |
Drinking history | | | | |
Yes | 3 | 8 | 2 | 3 |
No | 5 | 2 | 3 | 0 |
Smoking history | | | | |
Yes | 5 | 8 | 3 | 3 |
No | 3 | 2 | 2 | 0 |
Chemotherapy history | | | | |
Yes | 2 | 3 | 2 | 0 |
No | 6 | 7 | 3 | 3 |
AD: adenocarcinoma; SCLC: small cell lung cancer; SCC: squamous cell carcinoma; C-SCLC: combined-SCLC.
Bacterial composition is broadly similar across groups, but some taxa differ significantly
In total, 13 bacterial phyla, 108 genera, and 52 species were identified. The pulmonary bacterial flora of LC patients mainly belonged to Firmicutes (34.48 %), Proteobacteria (22.88 %), Bacteroidetes (22.60 %), and Actinobacteria (9.28 %). The main genera were Streptococcus (21.35 %), Neisseria (15.57 %), Prevotella (9.73 %), and Veillonella (5.37 %), in proportions that differed between groups (Figs. 1A and 1B). Metastats software was used to perform t-tests on species abundance data between groups. Firmicutes was more abundant in SCC than in AD and SCLC, and its abundance in SCC (38.7 %) was significantly higher than that in AD (29.6 %) (P = 0.017). However, most of the families whose abundance differed significantly between AD and SCC belonged to Bacteroidetes. This suggests that the Firmicutes difference between SCC and AD is not driven by a few species but by abundance changes across a wide range of species. Synergistetes and Spirochaetes were more abundant in AD than in SCC and SCLC, and the difference between AD and SCC was statistically significant (P < 0.05). The taxa with statistically significant differences are summarized in Table 2. The comparison between AD and SCC yielded the largest number of differing taxa.
Table 2
Taxa with significantly different abundance between groups
Phylum | AD Mean | AD Variance | AD Std.err | SCC Mean | SCC Variance | SCC Std.err | P value |
Synergistetes | 1.60E-03 | 2.62E-06 | 5.73E-04 | 1.69E-04 | 7.98E-08 | 8.93E-05 | 0.004 |
Firmicutes | 2.96E-01 | 4.27E-03 | 2.31E-02 | 3.87E-01 | 5.69E-03 | 2.39E-02 | 0.017 |
Spirochaetes | 1.54E-02 | 2.45E-04 | 5.54E-03 | 3.70E-03 | 2.52E-05 | 1.59E-03 | 0.047 |
Family | AD Mean | AD Variance | AD Std.err | SCC Mean | SCC Variance | SCC Std.err | P value |
Desulfovibrionaceae | 4.19E-04 | 6.85E-07 | 2.93E-04 | 2.78E-06 | 7.75E-11 | 2.78E-06 | 0.001 |
Marinifilaceae | 3.03E-04 | 2.47E-07 | 1.76E-04 | 5.42E-06 | 1.31E-10 | 3.62E-06 | 0.001 |
Sphingobacteriaceae | 0.00E+00 | 0.00E+00 | 0.00E+00 | 2.56E-04 | 6.57E-07 | 2.56E-04 | 0.001 |
Lentimicrobiaceae | 6.60E-04 | 4.90E-07 | 2.47E-04 | 3.89E-05 | 5.04E-09 | 2.25E-05 | 0.003 |
Synergistaceae | 1.60E-03 | 2.62E-06 | 5.73E-04 | 1.69E-04 | 7.98E-08 | 8.93E-05 | 0.004 |
Propionibacteriaceae | 3.07E-04 | 1.55E-07 | 1.39E-04 | 4.19E-05 | 2.92E-09 | 1.71E-05 | 0.011 |
Defluviitaleaceae | 2.82E-04 | 3.61E-08 | 6.72E-05 | 5.93E-05 | 1.55E-08 | 3.93E-05 | 0.014 |
Bacteroidaceae | 3.33E-03 | 4.60E-05 | 2.40E-03 | 9.74E-05 | 3.34E-08 | 5.78E-05 | 0.028 |
Rikenellaceae | 6.36E-04 | 3.61E-07 | 2.12E-04 | 1.24E-04 | 6.26E-08 | 7.91E-05 | 0.042 |
Family | AD Mean | AD Variance | AD Std.err | SCLC Mean | SCLC Variance | SCLC Std.err | P value |
Tannerellaceae | 1.25E-03 | 3.66E-07 | 2.14E-04 | 3.99E-04 | 1.73E-07 | 1.86E-04 | 0.004 |
Lentimicrobiaceae | 6.60E-04 | 4.90E-07 | 2.47E-04 | 1.26E-05 | 7.94E-10 | 1.26E-05 | 0.008 |
Leptotrichiaceae | 4.35E-02 | 7.68E-04 | 9.80E-03 | 1.73E-02 | 4.20E-05 | 2.90E-03 | 0.009 |
Phylum | SCC Mean | SCC Variance | SCC Std.err | SCLC Mean | SCLC Variance | SCLC Std.err | P value |
Tenericutes | 5.95E-04 | 4.16E-07 | 2.04E-04 | 9.18E-05 | 3.17E-08 | 7.96E-05 | 0.041 |
Family | SCC Mean | SCC Variance | SCC Std.err | SCLC Mean | SCLC Variance | SCLC Std.err | P value |
Leptotrichiaceae | 3.44E-02 | 4.08E-04 | 6.39E-03 | 1.73E-02 | 4.20E-05 | 2.90E-03 | 0.017 |
Ruminococcaceae | 4.55E-03 | 1.65E-05 | 1.28E-03 | 1.40E-03 | 1.30E-06 | 5.11E-04 | 0.024 |
Flavobacteriaceae | 1.47E-02 | 2.27E-04 | 4.76E-03 | 3.67E-03 | 6.05E-06 | 1.10E-03 | 0.026 |
Mycoplasmataceae | 4.61E-04 | 3.09E-07 | 1.76E-04 | 5.82E-05 | 1.59E-08 | 5.63E-05 | 0.031 |
Actinomycetaceae | 2.65E-02 | 2.39E-04 | 4.89E-03 | 1.50E-02 | 4.33E-05 | 2.94E-03 | 0.045 |
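The columns of Table 2 are related in a simple way that can be checked numerically: the standard error equals the square root of the variance divided by the group size from Table 1. The sketch below verifies this for the Firmicutes row and computes an unequal-variance (Welch) t statistic from the same values. The Welch form is an assumption chosen for illustration; the exact procedure Metastats applies is not specified here, so this is not intended to reproduce the P values in the table.

```python
import math

def std_err(variance, n):
    """Standard error of the mean from a sample variance and group size."""
    return math.sqrt(variance / n)

def welch_t(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """Welch's t statistic for two groups with unequal variances."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# Firmicutes row of Table 2; group sizes from Table 1 (AD n = 8, SCC n = 10).
ad_mean, ad_var, ad_n = 2.96e-01, 4.27e-03, 8
scc_mean, scc_var, scc_n = 3.87e-01, 5.69e-03, 10

se_ad = std_err(ad_var, ad_n)      # compare with the AD Std.err column
se_scc = std_err(scc_var, scc_n)   # compare with the SCC Std.err column
t_stat = welch_t(scc_mean, scc_var, scc_n, ad_mean, ad_var, ad_n)
print(f"SE(AD)={se_ad:.2e}  SE(SCC)={se_scc:.2e}  t={t_stat:.2f}")
```

The computed standard errors agree with the tabulated 2.31E-02 and 2.39E-02 to the precision shown.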
Alpha diversity differed among groups (Figs. 1C and 1D). Chao1 estimates species richness, and Shannon assesses species diversity, which reflects both the richness and the evenness of species in a sample community. AD had a higher Chao1 value than SCC (P = 0.0088) and SCLC (P = 0.0042), and a higher Shannon value than SCC (P = 0.0324). Both species richness and species diversity were therefore highest in AD, and alpha diversity was lowest in SCC. There was no statistically significant difference in alpha diversity between SCC and SCLC.
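For readers unfamiliar with the two indices, a minimal sketch of how they can be computed from per-taxon read counts is given below. It assumes the bias-corrected form of Chao1 and the natural logarithm for Shannon; the software and settings actually used for these calculations are not stated in this section, and the counts are hypothetical.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical OTU counts for two samples (not data from this study).
even_sample = [10, 10, 10, 10]    # four taxa, perfectly even
skewed_sample = [37, 1, 1, 1]     # four taxa, one dominant, three singletons

for name, counts in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, chao1(counts), round(shannon(counts), 3))
```

The skewed sample has a higher Chao1 estimate because its singletons imply unseen taxa, but a lower Shannon value because one taxon dominates, which illustrates why the two indices can rank groups differently.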
The similarity of bacterial species diversity was higher in AD than in SCC
According to the NMDS results, samples in SCC and SCLC were dispersed, whereas those in AD were concentrated (Fig. 2A). In the heatmap, the color gradient from blue to red indicates the distance between samples from near to far. AD, SCC, and SCLC formed separate clusters, except for individual samples. The within-group difference was smallest in AD and largest in SCC, with SCLC in between. AD and SCC showed a large between-group difference (Fig. 2B). PERMANOVA (permutational multivariate analysis of variance) was used to further verify the NMDS and heatmap results. AD had the smallest beta distance and SCC the largest. The beta distance between groups was slightly larger than that within groups, and the P value was less than 0.05, indicating that the test result is statistically reliable (Fig. 2C). In conclusion, among the three groups, bacterial species diversity was most similar within the AD group and least similar within the SCC group.
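The logic of PERMANOVA can be sketched as follows: a pseudo-F statistic compares between-group with within-group squared distances, and its significance is assessed by repeatedly shuffling the group labels. The example assumes Bray-Curtis dissimilarity and uses hypothetical abundance profiles; the distance metric and software used in this study are not stated in this section.

```python
import random
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0

def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        pairs = combinations(members, 2)
        ss_within += sum(dist[i][j] ** 2 for i, j in pairs) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(dist, labels, permutations=999, seed=0):
    """Return the observed pseudo-F and its permutation P value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical relative abundances (rows = samples); not data from this study.
samples = [
    [0.50, 0.30, 0.20], [0.48, 0.32, 0.20], [0.52, 0.28, 0.20], [0.49, 0.31, 0.20],
    [0.20, 0.30, 0.50], [0.22, 0.28, 0.50], [0.18, 0.32, 0.50], [0.21, 0.29, 0.50],
]
labels = ["AD"] * 4 + ["SCC"] * 4
dist = [[bray_curtis(a, b) for b in samples] for a in samples]
f_obs, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_obs:.1f}, P = {p_value:.3f}")
```

A large pseudo-F with a small P value indicates that samples are closer to members of their own group than to members of the other group, which is the pattern the heatmap and NMDS plots show visually.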
C-SCLC differs from SCLC in bacterial diversity
Although C-SCLC has the same histological components as SCLC, its bacterial species diversity differs from that of SCLC. Proteobacteria was more abundant in SCLC than in C-SCLC (P = 0.0238). At the genus level, the abundance of Neisseria, Aggregatibacter, Haemophilus, Peptostreptococcus, Gemella, Stomatobaculum, and Parvimonas was significantly greater in SCLC than in C-SCLC (P < 0.05), whereas the abundance of Tannerella was significantly lower in SCLC than in C-SCLC (P < 0.005). A hierarchical cluster tree of the samples was built using UPGMA, in which the C-SCLC and SCLC samples formed two separate clusters (Fig. 3A). PERMANOVA showed that the beta distance of C-SCLC was greater than that of SCLC, and that the beta distance between the two groups was significantly larger than that within groups (Fig. 3B). These data suggest that the composition of bacterial species differs between C-SCLC and SCLC.
Smoking affects the sputum bacterial composition
In the AD group, the ratio of smokers to non-smokers was 5:3. At the phylum level, the abundance of Spirochaetes and Bacteroidetes was significantly greater in the smoking group than in the non-smoking group (P < 0.05), and the abundance of Actinobacteria was significantly lower in the smoking group than in the non-smoking group (P = 0.0106). At the genus level, the abundance of Mogibacterium, Prevotella-2, Treponema-2, Pseudomonas, Alloprevotella, Fretibacterium, Leptotrichia, and Ralstonia was significantly greater in the smoking group than in the non-smoking group (P < 0.05), whereas the abundance of Corynebacterium and Rothia was significantly lower in the smoking group (P < 0.05). The abundance of many species was increased in the smoking group, suggesting that smoking may increase the risk of bacterial invasion. The heatmap showed that bacterial species diversity was more similar within groups than between groups (Fig. 4A). Although sample A11 was in the smoking group, its species diversity was more similar to that of the non-smoking group. The clinical record for A11 showed that the patient had quit smoking more than 10 years earlier. This suggests that quitting smoking may lead to a gradual shift of the respiratory bacteria toward a non-smoking status. PERMANOVA showed that the beta distance was significantly greater in the smoking group than in the non-smoking group (Fig. 4B). These phenomena were not observed in the fungal sequencing results (data not shown). Altogether, these data suggest that smoking can reduce the similarity of bacterial species diversity between individuals.
Fungal composition of sputum in lung cancer patients
Fungal sequencing reads were also clustered into OTUs. Each sample contained 79–349 OTUs, and a total of 1324 OTUs were obtained from the 26 samples. Few OTUs were shared among the groups (Fig. 5A). In total, 10 fungal phyla, 328 genera, and 359 species were identified. Unlike for bacteria, many fungal species were unclassified. At the phylum level, the sputum fungi of LC patients consisted mainly of Ascomycota (74.77 %), Basidiomycota (11.89 %), Mucoromycota (1.39 %), Rozellomycota (1.22 %), and Unclassified (10.50 %) (Fig. 5B). At the genus level, the main components were Candida (18.52 %), Cladosporium (8.32 %), Fusarium (5.42 %), Leptobacillium (2.98 %), Aspergillus (2.64 %), and Unclassified (23.63 %).
The diversity of fungi varies greatly among individuals
The Chao1 value of the AD group was greater than that of the SCC group (P = 0.0328), and the Chao1 value of C-SCLC was lower than that of the other three groups (P < 0.05). This suggests that fungal species richness was higher in AD than in SCC and lower in C-SCLC than in the other three groups (Fig. 6A). The beta distances between samples were large (Fig. 6B), indicating that, unlike the bacterial composition, the fungal composition of the sputum of LC patients varies greatly between individuals. There was no obvious clustering of fungal diversity among the groups.
Correlations between bacteria and fungi
Spearman's rank correlation coefficient [37] was used to analyze the correlation between bacterial and fungal species abundance. Correlations were analyzed between the three most abundant phyla and between the five most abundant genera of bacteria and fungi, respectively. Scatter plots of the correlations between bacteria and fungi are shown in Fig. 7. Proteobacteria was negatively correlated with Ascomycota (P = 0.05). Fusarium was positively correlated with Streptococcus, Neisseria, Prevotella-7, and Veillonella (P < 0.05), and Penicillium was positively correlated with Neisseria (P < 0.05).
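Spearman's coefficient is the Pearson correlation computed on ranks, with tied values receiving the mean of the ranks they span. A minimal sketch is shown below; the abundance values are hypothetical and the helper names are chosen for this example only.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical relative abundances across five samples (not study data).
fusarium = [0.01, 0.03, 0.02, 0.08, 0.05]
streptococcus = [0.10, 0.18, 0.15, 0.30, 0.22]
rho = spearman(fusarium, streptococcus)
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, the coefficient is insensitive to the skewed distributions typical of relative abundance data.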
Inflammatory cytokines and respiratory microbiome
According to the median of WBC, neutrophil ratio, lymphocyte ratio, and C-reactive protein (CRP), the AD group was divided into a low inflammatory factor group and a high inflammatory factor group. Alpha diversity and beta diversity of bacteria and fungi were then analyzed. There was no statistically significant difference in Chao1 or Shannon values between the high and low groups (P > 0.05) (suppl-table 1, suppl-table 2). In terms of beta diversity, there was no obvious similarity in bacterial or fungal species diversity within the two groups. Likewise, no correlation between inflammatory factors and microbiome diversity was observed in the SCC group.
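The median-based grouping described above can be sketched as follows. The CRP values and sample identifiers are hypothetical, and assigning values equal to the median to the low group is an assumption made for this example, since the handling of ties is not stated.

```python
from statistics import median

def median_split(values_by_sample):
    """Split sample IDs into low and high groups around the median value.

    Values equal to the median go to the low group (an assumption).
    """
    cutoff = median(values_by_sample.values())
    low = sorted(s for s, v in values_by_sample.items() if v <= cutoff)
    high = sorted(s for s, v in values_by_sample.items() if v > cutoff)
    return cutoff, low, high

# Hypothetical CRP values (mg/L) for eight AD samples; not data from this study.
crp = {
    "A1": 2.1, "A2": 15.0, "A3": 4.8, "A4": 33.2,
    "A5": 7.5, "A6": 1.2, "A7": 9.9, "A8": 21.4,
}
cutoff, low_group, high_group = median_split(crp)
print(cutoff, low_group, high_group)
```

The same split would be repeated for each inflammatory marker before comparing alpha and beta diversity between the resulting groups.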