Bacterial richness, abundance, and diversity
We identified 3,073 unique OTUs from a total of 7.63 million sequences in the floor dust samples from the 499 classrooms, including 29 phyla, 57 classes, 148 orders, 320 families, 1,193 genera, and 2,045 species. Of the total 3,073 OTUs, 1,028 were not identifiable to the class or lower level. Among the top five richest phyla, the Proteobacteria had the largest number of OTUs (922 identified OTUs at the class level), followed by Firmicutes (770), Actinobacteria (669), Bacteroidetes (414), and Cyanobacteria (66) (Figure 1a). At the class level, Actinobacteria (the phylum Actinobacteria, 605 identified OTUs at the order level) was richest, followed by Bacilli (Firmicutes, 460), Gammaproteobacteria (Proteobacteria, 380), Alphaproteobacteria (Proteobacteria, 232), and Clostridia (Firmicutes, 218).
However, the rank order of the five richest phyla was not concordant with that of the five most abundant phyla. The phylum Firmicutes was most abundant (relative abundance: 0.29), followed by Proteobacteria (0.28), Actinobacteria (0.18), Cyanobacteria (0.14), and Bacteroidetes (0.09) (Supplemental Figure 1). The order Lactobacillales was most abundant (relative abundance: 0.14; Firmicutes), followed by Spirulinales (0.11; Cyanobacteria), Clostridiales (0.07; Firmicutes), Bacteroidales (0.07; Bacteroidetes), Pseudomonadales (0.06; Proteobacteria), and Micrococcales (0.06; Actinobacteria). Halospirulina was the most abundant genus (the only genus within the order Spirulinales; relative abundance=0.11), followed by Lactobacillus (0.07), Streptococcus (0.03), Sphingomonas (0.03), Clostridium (0.03), and Pseudomonas (0.03) (Figure 1b). Of the ten most abundant genera, only three were also among the ten richest genera, which were Lactobacillus (57 species identified), Corynebacterium (45), Bacillus (45), Paenibacillus (31), Mycobacterium (28), Bacteroides (25), Pseudomonas (24), Prevotella (23), Nocardioides (23), and Flavobacterium (23). We identified 15 Staphylococcus species with a relative abundance of 0.018 (including unidentified species) and five Propionibacterium species with a small relative abundance (<0.001). Gram-negative bacteria were more abundant (relative abundance=0.54) and richer (1,632 OTUs, 53%) than gram-positive bacteria (relative abundance=0.46; 1,441 OTUs, 47%) in the bacterial community of these schools.
The number of bacterial genera identified ranged from 470 to 705 (median=577) across the 50 schools and from 44 to 397 (median=246) across the 499 classrooms (Figure 2). The Shannon-Weaver diversity index ranged from 3.61 to 4.72 (median=4.14) for the schools and from 0.62 to 4.75 (median=3.64) for the classrooms. Pielou’s evenness index ranged from 0.57 to 0.72 (median=0.65) for the schools and from 0.13 to 0.80 (median=0.67) for the classrooms. The Bray-Curtis index (1,225 unique pairs of schools) ranged from 0.23 to 0.63 (median=0.41), indicating that one-half of the paired schools were at least 40% dissimilar in their genus composition. For the classrooms, the Bray-Curtis index (more than 124,000 pairs) ranged from 0.08 to 0.99 (median=0.66).
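As an illustration of the three indices reported above, the following minimal sketch computes the Shannon-Weaver index, Pielou's evenness, and the Bray-Curtis dissimilarity from genus counts. The sample names and counts are hypothetical, and the sketch assumes the natural logarithm and raw counts as input; the software and any normalization actually used in the study are not stated in this section.

```python
import math
from itertools import combinations

def shannon(counts):
    """Shannon-Weaver index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou(counts):
    """Pielou's evenness J' = H' / ln(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(richness) if richness > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))

# Hypothetical genus counts for three samples (same genus order in each list).
samples = {
    "s1": [40, 30, 20, 10, 0],
    "s2": [25, 25, 25, 25, 0],
    "s3": [0, 0, 10, 10, 80],
}

for name, counts in samples.items():
    print(name, round(shannon(counts), 3), round(pielou(counts), 3))

for (n1, c1), (n2, c2) in combinations(samples.items(), 2):
    print(n1, n2, round(bray_curtis(c1, c2), 3))

# Number of unique pairs among 50 schools and among 499 classrooms.
print(math.comb(50, 2), math.comb(499, 2))  # 1225 124251
```

The pair counts follow from n(n-1)/2, which is where the 1,225 school pairs and the more than 124,000 classroom pairs come from.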
Relative abundance of dominant genera and hierarchical clustering of schools
We examined the relative abundance of the top ten genera within each school. In 32 of the 50 schools (64%), the cumulative relative abundance of the top ten genera was 0.4 or higher (Figure 3). The combined relative abundance of Halospirulina and Lactobacillus exceeded that of any other genus in every school except school 34, where Enterococcus was more abundant than the two combined. Pseudomonas was the single most abundant genus in schools 46 and 49. Across all classrooms, the ten most abundant genera were Halospirulina (relative abundance: 0.12), Lactobacillus (0.07), Streptococcus (0.03), Sphingomonas (0.03), Clostridium (0.03), Pseudomonas (0.03), Acinetobacter (0.02), Enterococcus (0.02), Corynebacterium (0.02), and Methylobacterium (0.02) (Supplemental Figure 2).
Hierarchical clustering of the 50 schools produced four clusters (dendrogram in Figure 3), each with a characteristic genus composition. Cluster A included schools with Halospirulina at a medium relative abundance (~0.1 within the cluster) and Bacillus (0.075), along with a low abundance of Lactobacillus (<0.02) (Supplemental Figure 3). Cluster B included 18 schools with the highest within-school relative abundance of Halospirulina (~0.2). Cluster C was composed of schools with a lower relative abundance of Halospirulina (0.06) than the other clusters, a medium abundance of Lactobacillus (~0.07), and higher abundances of Sphingomonas and Pseudomonas than the other clusters. Cluster D consisted of schools with a higher abundance of Lactobacillus (0.12) than those in the other clusters. In clusters A and C, the cumulative relative abundance of the top ten genera was generally lower than in clusters B and D. When average water damage scores were compared among the clusters, cluster D had the highest mean score and cluster A the lowest (D > C > B > A), and cluster A scored significantly lower than cluster D. Pairwise comparisons adjusted with Tukey HSD were all significant except for two pairs of clusters (A and B; C and D), which did not differ, and clusters A and C, which differed only marginally (Supplemental Figure 4).
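To make the clustering step concrete, here is a minimal sketch of agglomerative hierarchical clustering on a dissimilarity matrix, cut at a chosen number of clusters. The linkage method and the dissimilarity measure used for Figure 3 are not stated in this section, so average linkage (UPGMA) is an assumption, and the small matrix is hypothetical.

```python
def average_linkage_clusters(dist, k):
    """Merge the two closest clusters (average linkage) until k clusters remain.

    dist is a symmetric matrix (list of lists) of pairwise dissimilarities.
    Returns a sorted list of clusters, each a sorted list of sample indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None  # (mean dissimilarity, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                mean_d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or mean_d < best[0]:
                    best = (mean_d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return sorted(sorted(c) for c in clusters)

# Hypothetical dissimilarities among four schools (assumed values, not study data).
dist = [
    [0.00, 0.10, 0.80, 0.85],
    [0.10, 0.00, 0.90, 0.95],
    [0.80, 0.90, 0.00, 0.20],
    [0.85, 0.95, 0.20, 0.00],
]
print(average_linkage_clusters(dist, 2))
```

With a different linkage rule (for example complete or Ward linkage) the merge order, and therefore the cluster membership, can change, which is why the choice is flagged as an assumption here.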
Association of richness and community composition with school/classroom characteristics
Distributions of the continuous environmental variables and their correlations with the Shannon-Weaver index are presented in Figure 4. The Shannon-Weaver index was not associated with the average water damage score or the FCI score; however, it was weakly and positively correlated with the number of students in the classroom (correlation coefficient=0.09, p-value=0.06). It was also weakly and negatively correlated with air RH (correlation coefficient=-0.12, p-value=0.01) and air temperature (-0.12, <0.01). The FCI scores were moderately correlated with air RH (-0.35, p<0.001) and air temperature (0.45, p<0.001).
We created a rarefied genus accumulation curve for each level of each categorical environmental variable (Figure 5). For schools in the southwestern area of the city, the steepest initial slope of the accumulation curve and the highest plateau indicated the highest proportion of relatively abundant genera and the highest richness, respectively. Interestingly, all three schools with the highest proportion of relatively abundant genera (Figure 5) were categorized in cluster C (Figure 2), consistent with the initial rarefaction curve for cluster C being steeper than that of any other cluster (Supplemental Figure 5). All three schools showing the lowest proportion of relatively abundant genera were in cluster D and located in the northeastern area of the city. The continuous increase of the rarefaction curves for clusters B and D indicated the presence of many rare genera (Supplemental Figure 5). The schools in need of the least repair (Q1, the first quartile of the school FCI scores) had a higher proportion of relatively abundant genera and greater richness than any other FCI group. The number of students in the classroom did not influence bacterial richness. School groups by quartile of classroom average water damage score showed only slight differences in richness (similar plateau heights), but the most damaged schools (Q4) had a higher proportion of abundant genera than the other groups. Schools with the lowest temperature had the greatest richness and a higher proportion of abundant genera, whereas the effect of RH on richness was minimal.
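The relationship between curve shape and community structure can be sketched with individual-based rarefaction, in which the expected number of genera in a random subsample of n sequences is the sum over genera of 1 - C(N - N_i, n) / C(N, n). This is one common form of rarefaction and is an assumption here; the curves in Figure 5 may instead accumulate samples, and the counts below are hypothetical.

```python
import math

def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n individuals.

    counts: individuals per taxon; n: subsample size, 0 <= n <= sum(counts).
    Uses E[S_n] = sum_i (1 - C(N - N_i, n) / C(N, n)).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must be between 0 and the total count")
    denom = math.comb(total, n)
    # math.comb returns 0 when n > total - c, so such taxa are always sampled.
    return sum(1 - math.comb(total - c, n) / denom for c in counts if c > 0)

# Two hypothetical communities with the same richness (5 genera, 100 reads).
even = [20, 20, 20, 20, 20]
dominated = [92, 2, 2, 2, 2]

for n in (5, 20, 50, 100):
    print(n, round(rarefied_richness(even, n), 2),
          round(rarefied_richness(dominated, n), 2))
```

In this toy example the even community reaches its plateau after few reads, while the dominated community keeps rising slowly because its rare genera are only picked up at larger subsample sizes, the same pattern the text describes for curves that continue to increase.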
ANOSIM results (Figure 6) indicated that the effects of the categorical variables on community composition were small (R values < 0.03) but significant, except for the type of floor material. The classrooms in schools in need of the least repair (Q1 of FCI score) were more dissimilar in composition than those in need of more repair (the group needing the most repair was the least dissimilar). Among the environmental variables, the physical condition of the building (FCI score) affected dissimilarity the most. Genus dissimilarity did not differ by classroom floor material type; however, it differed by classroom floor level (R=0.02, p-value < 0.01), with first-floor classrooms generally being the most dissimilar. Dissimilarity tended to increase as water damage score or RH increased, whereas classrooms with the highest air temperature were the least dissimilar. The number of students had a marginal effect, with the classrooms with the most students (Q4) being the least dissimilar. All of the full and reduced multivariate models (PERMANOVA), adjusted for the other environmental variables, indicated that every environmental factor significantly affected dissimilarity in genus composition of paired classrooms, although the effects were small (Table 1); this was consistent with the results of the univariate ANOSIM analyses. Mean water damage scores of classrooms in schools in the southwestern or northern region of the city were significantly higher than those in the northeastern or southeastern region (Supplemental Figure 6). Because of this correlation, we constructed reduced PERMANOVA models without school area or water damage, which yielded results similar to those of the full model. The PERMANOVA models also indicated that the effects of the physical condition of the building, area of the school, and classroom air temperature were greater than those of the other environmental variables.
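For readers unfamiliar with the ANOSIM R statistic, the sketch below computes it in its standard form, R = (mean between-group rank - mean within-group rank) / (M / 2), where M is the number of sample pairs, and attaches a permutation p-value obtained by shuffling group labels. The dissimilarity matrix, labels, and number of permutations are hypothetical, and this is not the authors' implementation.

```python
import random
from itertools import combinations

def average_ranks(values):
    """Rank values from 1 to len(values); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def anosim_r(dist, labels):
    """ANOSIM R from a symmetric dissimilarity matrix and group labels."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for (i, j), r in zip(pairs, ranks) if labels[i] == labels[j]]
    between = [r for (i, j), r in zip(pairs, ranks) if labels[i] != labels[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)

def anosim_test(dist, labels, permutations=999, seed=0):
    """Return (observed R, permutation p-value) by shuffling the labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)

# Hypothetical dissimilarities among four classrooms in two groups.
dist = [
    [0.00, 0.10, 0.80, 0.85],
    [0.10, 0.00, 0.90, 0.95],
    [0.80, 0.90, 0.00, 0.20],
    [0.85, 0.95, 0.20, 0.00],
]
r_value, p_value = anosim_test(dist, ["A", "A", "B", "B"])
print(r_value, p_value)
```

R near 1 means between-group dissimilarities consistently outrank within-group ones, and R near 0 means grouping explains little. With many samples, even a very small R (such as the values below 0.03 reported here) can be statistically significant.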
The first two of the three dimensions of the unconstrained NMDS are presented in Figure 7. The stress value (0.1) indicates that the three-dimensional ordination is fair. Abundant phyla such as Firmicutes, Proteobacteria, Actinobacteria, Cyanobacteria, and Bacteroidetes were all clustered together in the majority of the schools. Schools in the northeastern area of the city had a distinct community composition compared to those in other areas. Schools in need of the least repair (Q1) had a unique genus composition, especially compared with those in Q2 and Q3. Schools with the least water damage (Q1) also had a unique composition compared with others, especially Q2 and Q3. The classrooms with the most students (Q4) had a different composition from those in Q1 and Q2. However, quartile groups of RH and temperature did not show a characteristic composition in the NMDS.