Seasonal changes of the airborne microbiome in a large urban area

Background Compared to soil or aquatic ecosystems, the atmosphere is still an underexplored environment for microbial diversity. Besides its ecological importance, the spatial and temporal characterization of aerosolized microorganisms is relevant for understanding allergy and disease outbreaks, especially in highly populated cities. Results In this study, we surveyed the composition, variability and sources of microbes (bacteria and fungi) in the near surface atmosphere of a highly populated area, spanning ~ 4,000 Km around the city center of Madrid (Spain), in different seasonal periods along two years. We found a core of abundant bacterial genera robust across space and time, most of soil origin, while fungi were more sensitive to environmental conditions. Microbial communities showed clear seasonal patterns driven by variability of environmental factors, mainly temperature and accumulated rain, while local sources played a minor role. We also identified taxa in both kingdoms characteristic of seasonal periods, but not of specific sampling sites or plant coverage. Conclusions The present study suggests that the near surface atmosphere of urban environments constitutes a stable ecosystem, with a relatively homogenous composition, modulated by climatic variations. As such, it contributes to our understanding of the long-term changes associated to the human exposome in the air of highly populated areas. BACKGROUND The composition and dynamics of the microbial diversity present in the atmosphere is still under intensive research and discovery. Bacteria and fungi propagules constitute a significant fraction of this aerobiota, which are aerosolized from different terrestrial and aquatic ecosystems[1]. Although the atmospheric conditions may not favor their survival, meteorological factors like rainfall or wind currents are key factors affecting their abundance and prevalence in the air [2, 3]. Moreover, air masses can carry these particles across trans-continental distances before being precipitated [4-7]. During their presence in the atmosphere they play an ecological role by acting like ice nuclei, activating cloud formation and triggering the bioprecipitation[8], although a minor part is also withdrawn by dry deposition[9]. Once deposited, microbial interactions start in the new environment and contribute to the biogeochemical cycles[10]. Some times they can also exert a negative effect by disseminating plant and animal diseases throughout both natural and livestock[11, 12]. Furthermore, because of their ubiquity, their adaptable metabolism and the large volume of biomass that they represent as a whole, monitoring microorganisms may be crucial in a climate change scenario[13, 14]. Among all environments, the characterization of these microorganisms in metropolitan areas is attracting much attention because such particles may have harmful consequences on human health. As part of the inhalable fraction, some may trigger allergic reactions, cause pulmonary diseases or aggravate respiratory pathologies[15]. However, the dynamics and composition of the microbial aerosols within the cities is still unclear due to several factors. Firstly, different land-use can provide diverse sources of microorganisms, setting up differences in abundance and diversity compared to areas lacking these elements or having significant particularities[16-19]. Thus, urban parks provide soil and plant-related niches, in addition to fauna-related microbes, while ponds and fountains are sources of aquatic microbial life[20]. In addition, it is known that environmental changes drive important variations in the airborne communities both at short-term[21-23] and associated to seasonal variability [4, 24-28]. Especially relevant are the latter. Since many of the microbial organisms are commensals or saprophytes, they are linked to the life cycle of higher organisms to grow and multiply, which are usually synchronized with seasonal changes, e.g. plant growth and decay. Lastly, atmospheric transport and mixing of air masses as well as extreme atmospheric events (dust intrusions, pollution hazes or storms) may add biological variability across time and space[29-33]. Altogether, the potential influence of these many environmental factors may hinder the characterization of the aerobiota in urban environments. Although urbanization has been proposed to homogenize airborne microbiota [34, 35] the influence of different environmental factors on prokaryotic and eukaryotic diversity in the urban atmosphere is not properly understood yet. Previous works on the effect of seasonality in the air microbiome of urban environments have been conducted in one or a few points in the same city[18, 27, 3639]. On the other hand, metagenomic studies of largely sampled urban areas have been usually restricted to short time periods[17, 35, 40, 41]. Here, we used high-throughput DNA sequencing to simultaneously survey bacterial and fungal communities at 11 different sites scattered throughout a large metropolitan region in Madrid (Spain), across different seasonal periods for two years. The sampling locations are representative of urban scenarios with different degrees of urbanization and population density. Combined with meteorological and air pollution data, the present work provides a comprehensive analysis of the dominant and season specific bacterial and fungal taxa present in the near surface atmosphere of a wide urban area, evaluating the relative contribution of spatial and long-term temporal characteristics and assessing the influence of different environmental and pollution factors. METHODS Sampling sites characteristics Eleven sites scattered throughout the Community of Madrid (Spain) (Fig. 1a), were sampled, covering an area of ~ 4,000 Km around the city center and being representative of different urbanization levels. Community of Madrid is closed to the center of the Iberian Peninsula, ca. 320 Km away from any costal place and 678 meters above mean sea level. Total population is estimated in 6,6 million people differently distributed throughout the territory, with over 3 million living in the city center. Around 16 million journeys are made each day by the population, 43% related to daily commuting (official data from Madrid Regional Transport Consortium, 2018, https://www.crtm.es/). Aerial pictures of the region (Google Earth version 7.3.3; https://www.google.com/earth/download/ge/) were used to examine the percentage of green areas, parks and non-urbanized lots in 1 Km around the sampling point and, accordingly, the sites were classified into “Green” (> 7% non-urbanized area: G1-G3), “Parks” (between 3-7%, P1-P3) or “Built” (<3%, B1-B5). Thus, G sites (mostly found in the north of the region) are mainly residential areas characterized by large wild zones, clearing areas and large green zones. Sampling points in P sites (eastern regions) are set in urban environments but surrounded by short buildings and some urban parks. B sites (center and southern region) are placed in highly built areas, with scarce or small green areas and busy streets around. General district demography associated to the sampling sites showed that G and P places are less populated (median: 2500±1937 and 2563 ±2558 inhabitants/km, respectively), compared to B sites (median: 4359±2181 inhabitants/km) (official data 2018, http://www.madrid.org/iestadis/). Sampling methodology and DNA extraction and quantification Samples at the 11 sites were collected simultaneously using volumetric spore traps (Burkard Manufacturing Co., England, UK). Each sample corresponds to a 7-days collection. Sampling procedures, DNA extraction and quantitation were performed as described previously in [42]. A synchronous sample collection was conducted each season along a period of two years (henceforth year A and B), starting Summer 2015 and finishing Spring 2017. A total of 87 samples were collected (1 sample was missing because of a device failure during collection in site B2, Fall A). High-throughput DNA sequencing High-throughput sequencing analyses were performed using the purified DNA from each sample in a targeted amplicon sequencing (TAS) approach. Hypervariable regions V3-V4 of the 16S rRNA gene of bacteria and region 5.8S – ITS2 of fungi were amplified using the following universal primers attached to adaptors and multiplex identifier sequences: Bakt_341 (F): 5′CCTACGGGNGGCWGCAG -3’; Bakt_805 (R): 5′GACTACHVGGGTATCTAATCC -3’ [43] for bacteria; and ITS86 (F): 5′GTGAATCATCGAATCTTTGAA-3’ [44]; ITS-4 (R): 5′TCCTCCGCTTATTGATATGC -3′[45] for fungi. Purified-amplicon libraries were sequenced in Illumina® MiSeq platform (2×300 bp reads) at Madrid Scientific Park (Madrid, Spain), with a minimum sequencing depth of 100,000 reads/amplicon. 1 sample of the 16S library (G3 Winter B) was discarded because DNA amplification failed, so a total of 86 samples were analyzed for bacteria. DNA from a negative control (sample obtained with the same procedure applied in sample collection but keeping the device off) was also included in the sequencing protocol to discard sample contamination. Sequence preprocessing PANDAseq v2.8 [46] was used for assembling paired-ends sequences of the 16S DNA library, filtering by Q-score quality (0.6), trimming the primers sequences and excluding the sequences exceeding the length of the amplicon (min: 400 bp, max: 500 bp). For fungal sequences, the sequencing protocol exceeded the length of the amplicon, and “read_fastq” from Biopieces v2.0 (http://maasha.github.io/biopieces/) to remove the primer sequence at the end of the amplicon, followed by “fastq-join” [47] (https://github.com/brwnj/fastq-join) were applied instead to pair the sequences. Global processing of the sequences was conducted in Qiime suite environment [48](version 1.9.1, http://qiime.org). Chimeras were excluded using USEARCH v8.1 (https://drive5.com/usearch/) in default mode. OTUs clustering and taxonomic assignments were performed with the default algorithm of Qiime (pick_open_reference_otus.py), using UCLUST as method for picking OTUs [49] at 97% similarity cutoff. Silva database (release_132) for bacteria[50], and UNITE v7.0[51] for fungi were used for the taxonomic assignments. OTUs assigned to chloroplast, mitochondria, “Unassigned” or that did not reach a defined taxonomic rank at Phylum level were filtered out. For bacteria, we conducted an additional manual revision to search for potential contaminations from insect or human-skin microbiota during manipulation, as in[28], identifying a total of 9 OTUs that were removed for further analyses. Filtering and normalization Since spurious OTUs with very low counts may appear due to PCR and sequencing errors [52, 53], we estimated a ‘noise floor’ following the statistical procedure outlined in [28], which resulted in one and two counts for bacteria and fungi, respectively. As a pre-analysis step, we thus removed singletons (OTUs present with an abundance of one count in one sample and zero in the rest) for bacteria, as well as singletons and doubletons for fungi. This procedure eliminated 16,845 OTUs in bacterial samples (30% of the original table) and 3,582 in fungal samples (26%). However, they only represented between 0.3-1.5% (bacteria) or 0.2-0.29% (fungi) of the samples relative abundances. To establish the core of bacterial and fungal genera (taxa, at the Genus level, present in at least 95% of all samples) we considered a more restrictive criterion of ‘presence’ taking into account experimental variability with duplicate and simultaneously running spore traps, as described in [28]. This sets a threshold of 0.032% in relative abundance for an OTU to be reliably observed in a given sample, which we used as a criterion for presence/absence. In all calculations comparing relative abundances (as those shown in Fig. 1 and Figs. S2-S3) we corrected for biases due to differences in size between samples using cumulative sum scaling normalization[54], as implemented in the “metagenomeSeq” package. Environmental and pathogen annotations We collected the top 150 OTUs of every sample (covering at least 70% of relative abundance per sample, giving a total of ~2,000 OTUs) as representatives of the airborne bacterial community. The predicted sources (Fig. 1b) were assigned using the Seqenv pipeline[55], selecting the most frequent annotations from the top 5 matches (>97% similarity) associated to each OTU sequence. The resulting annotations were assigned to 7 different habitats: Animal-related (which includes gutand skin related microbiota), Plant (phyllosphere and rhizosphere-related bacteria), water (including Freshwater, Marine or Aquatic when the annotation was not specific enough to discern the type of aquatic environment) and Soil (including sediment, sand, sludge, desert, and related expressions). Taxa with equal preference for different habitats were assigned to “Generalists”, a relatively common situation for bacteria (many of which are usually isolated, for instance, from plant surfaces as well as soil samples). For those OTUs without environmental assignation but defined to species level, a manual assignment of the ecological habitat was set based on the related literature. A total of 654 (~33%) OTUs could not be assigned to any habitat (NA), mainly sequences from cultureindependent studies and incomplete taxonomic description (rank Family or higher). The same pipeline was used to assign most likely habitats to fungal taxa, including a Lichen category. The ecological guilds for fungal OTUs (Fig. S1) were assigned using the FUNGuild pipeline[56] for the whole set of fungal taxa. We gathered the results in the eleven different categories shown in Fig. 1, including each taxon into one or several of these categories according to their potential ecological guild. Around 18% of fungal OTUs could not be assigned to any guild. For the identification of pathogenic bacteria and fungi, we compiled a list of potential human pathogens from references[24, 57-59]. Richness and evenness estimates Indexes for estimation of alpha-diversity were calculated based on Hill numbers (effective number of species)[60] as implemented in the package ‘iNEXT’[61]. For richness (total number of species) we give the asymptotic estimate (Chao1 index), and for evenness (similarity in species relative abundance in a sample) we use Pielou’s evenness index[62]. Pielou’s index varies between 0 and 1, with larger values representing more even distributions in abundance among species. This index is calculated from the asymptotic values of the Hill numbers q=0 (Chao1 richness) and q=1 (Shannon diversity, SD) as ln SD /ln  (Chao1).


Sampling methodology and DNA extraction and quantification
Samples at the 11 sites were collected simultaneously using volumetric spore traps (Burkard Manufacturing Co., England, UK). Each sample corresponds to a 7-days collection. Sampling procedures, DNA extraction and quantitation were performed as described previously in [42]. A synchronous sample collection was conducted each season along a period of two years (henceforth year A and B), starting Summer 2015 and finishing Spring 2017. A total of 87 samples were collected (1 sample was missing because of a device failure during collection in site B2, Fall A).

High-throughput DNA sequencing
High-throughput sequencing analyses were performed using the purified DNA from sample of the 16S library (G3 Winter B) was discarded because DNA amplification failed, so a total of 86 samples were analyzed for bacteria. DNA from a negative control (sample obtained with the same procedure applied in sample collection but keeping the device off) was also included in the sequencing protocol to discard sample contamination.

Sequence preprocessing
PANDAseq v2.8 [46] was used for assembling paired-ends sequences of the 16S DNA library, filtering by Q-score quality (0.6), trimming the primers sequences and excluding the sequences exceeding the length of the amplicon (min: 400 bp, max: 500 bp). For fungal sequences, the sequencing protocol exceeded the length of the amplicon, and "read_fastq" from Biopieces v2.0 (http://maasha.github.io/biopieces/) to remove the primer sequence at the end of the amplicon, followed by "fastq-join" [47] (https://github.com/brwnj/fastq-join) were applied instead to pair the sequences. Global processing of the sequences was conducted in Qiime suite environment [48](version 1.9.1, http://qiime.org). Chimeras were excluded using USEARCH v8.1 (https://drive5.com/usearch/) in default mode. OTUs clustering and taxonomic assignments were performed with the default algorithm of Qiime (pick_open_reference_otus.py), using UCLUST as method for picking OTUs [49] at 97% similarity cutoff. Silva database (release_132) for bacteria [50], and UNITE v7.0 [51] for fungi were used for the taxonomic assignments. OTUs assigned to chloroplast, mitochondria, "Unassigned" or that did not reach a defined taxonomic rank at Phylum level were filtered out. For bacteria, we conducted an additional manual revision to search for potential contaminations from insect or human-skin microbiota during manipulation, as in [28], identifying a total of 9 OTUs that were removed for further analyses.

Filtering and normalization
Since spurious OTUs with very low counts may appear due to PCR and sequencing errors [52,53], we estimated a 'noise floor' following the statistical procedure outlined in [28], which resulted in one and two counts for bacteria and fungi, respectively. As a pre-analysis step, we thus removed singletons (OTUs present with an abundance of one count in one sample and zero in the rest) for bacteria, as well as singletons and doubletons for fungi. This procedure eliminated 16,845 OTUs in bacterial samples (30% of the original table) and 3,582 in fungal samples (26%). However, they only represented between 0.3-1.5% (bacteria) or 0.2-0.29% (fungi) of the samples relative abundances.
To establish the core of bacterial and fungal genera (taxa, at the Genus level, present in at least 95% of all samples) we considered a more restrictive criterion of 'presence' taking into account experimental variability with duplicate and simultaneously running spore traps, as described in [28]. This sets a threshold of 0.032% in relative abundance for an OTU to be reliably observed in a given sample, which we used as a criterion for presence/absence.
In all calculations comparing relative abundances (as those shown in Fig. 1 and Figs. S2-S3) we corrected for biases due to differences in size between samples using cumulative sum scaling normalization [54], as implemented in the "metagenomeSeq" package.

Environmental and pathogen annotations
We collected the top 150 OTUs of every sample (covering at least 70% of relative abundance per sample, giving a total of ~2,000 OTUs) as representatives of the airborne bacterial community. The predicted sources (Fig. 1b) were assigned using the Seqenv pipeline [55], selecting the most frequent annotations from the top 5 matches (>97% similarity) associated to each OTU sequence. The resulting annotations were assigned to 7 different habitats: Animal-related (which includes gut-and skin related microbiota), Plant (phyllosphere and rhizosphere-related bacteria), water (including Freshwater, Marine or Aquatic when the annotation was not specific enough to discern the type of aquatic environment) and Soil (including sediment, sand, sludge, desert, and related expressions). Taxa with equal preference for different habitats were assigned to "Generalists", a relatively common situation for bacteria (many of which are usually isolated, for instance, from plant surfaces as well as soil samples). For those OTUs without environmental assignation but defined to species level, a manual assignment of the ecological habitat was set based on the related literature. A total of 654 (~33%) OTUs could not be assigned to any habitat (NA), mainly sequences from cultureindependent studies and incomplete taxonomic description (rank Family or higher). The same pipeline was used to assign most likely habitats to fungal taxa, including a Lichen category.
The ecological guilds for fungal OTUs (Fig. S1) were assigned using the FUNGuild pipeline [56] for the whole set of fungal taxa. We gathered the results in the eleven different categories shown in Fig. 1, including each taxon into one or several of these categories according to their potential ecological guild. Around 18% of fungal OTUs could not be assigned to any guild.
For the identification of pathogenic bacteria and fungi, we compiled a list of potential human pathogens from references[24, [57][58][59].

Richness and evenness estimates
Indexes for estimation of alpha-diversity were calculated based on Hill numbers (effective number of species) [60] as implemented in the package 'iNEXT' [61]. For richness (total number of species) we give the asymptotic estimate (Chao1 index), and for evenness (similarity in species relative abundance in a sample) we use Pielou's evenness index [62]. Pielou's index varies between 0 and 1, with larger values representing more even distributions in abundance among species. This index is calculated from the asymptotic values of the Hill numbers q=0 (Chao1 richness) and q=1 (Shannon diversity, SD) as ln /ln ( ℎ 1).

Mantel tests
A matrix of spatial distances (in Km) between the different sampling locations was obtained from the latitude and longitude coordinates of each sampling site, using the function 'distm' from package 'geosphere' with geodesic distance. The Mantel test was calculated with function 'mantel' in 'vegan' R package, using Spearman rank correlation and permutation tests (1,000 permutations) for significance.

Beta-diversity
The Morisita-Horn distance was used as it is an abundance-based measure of similarity dominated by the most abundant species, resistant to under-sampling (rare species have little effect) [63]. Since composition in our samples is dominated by few relatively abundant and pervasive genera, this distance allows a better visualization of spatiotemporal influences on the similarity of our microbial communities. Principal Coordinate Analysis (PCoA) was applied after rarefaction to minimum sampling depth and Hellinger standardization. Taxa abundances were grouped at the genus level.
Contributions to principal coordinates axis were corrected for negative eigenvalues using 'cailliez' method. PERMANOVA tests were performed with 'adonis' function in 'vegan' package, checking first for homogeneity of group variances ('permutest.betadisper' function in 'vegan'). The most abundant genera were correlated to dimensions in principal coordinates space using 'envfit' function in 'vegan' (Fig.   3a,b).

Indicator species
Species indicators of a given group of samples (e.g. a seasonal period) are characterized by an index (IndVal) between 0 and 1, which is the product of two components [64]: specificity, or abundance of the species in the group relative to its total abundance, and fidelity, or relative frequency of occurrence of the species in samples belonging to the group. IndVal indices for all bacterial and fungal genera were obtained with the R package labdsv. Significance was calculated by 10,000 randomizations of groups, followed by Benjamini-Hochberg correction for multiple testing. Only species with IndVal values > 0.4 /0.5 (for bacteria and fungi respectively) and P < 0.01 are shown in Fig. 3.

Meteorological data.
Meteorological week, with the exception of rain (total amount).

Analysis with environmental data
Factor analysis: Exploratory factor analysis was performed using the R package FactoMineR [65]. Principal component analyses (PCA) were done on the environmental matrix with standardized data, .
dbRDA: PCoA ordinations using Hellinger standardization and Morishita-Horn distance were constrained to the environmental variables using function capscale in vegan package. Variance explained by the different variables was corrected as in [66] (adjusted R 2 ). To find relevant variables significantly associated to community variation, we first applied model selection to all environmental variables using 'vegan' function 'ordiR2step'. Linear dependencies between explanatory (environmental) variables were checked calculating variance inflation factors with function 'vif.cca' in 'vegan'. We retained in our analysis the variables with values of vif below 5, excluding some of the variables with significant correlations (ozone levels, strongly correlated with temperature and NO 2 , and relative humidity), and keeping only unbiased factors with more plausible ecological meaning.

Composition and sources of microorganisms
Using spore traps [42], we collected air samples along four seasonal periods during two years. We sampled simultaneously in 11 sites scattered across a wide area within the territorial demarcation of the Community of Madrid, Fig. 1a. The sampling sites include three locations within Madrid city center (G2, B3 and B5) as well as different urban and peri-urban scenarios belonging to medium to large population size towns surrounding the central area (Methods). A total of 87 samples (7-days period each) were subjected to targeted amplicon DNA sequencing to monitor the bacterial and fungal diversity gathered in every sampling site (Methods and Table S1).
We first analyzed the composition and possible sources of the taxa present in our samples. Most of the bacterial taxa were of soil origin (~70% of assigned taxa), according to the most frequent matches of Environmental Ontology (ENVO) terms (Methods), followed by those related to aquatic environments (~14%). These potential sources agree with other studies of airborne bacterial composition in urban and rural environments [4,16,34,37,39,67]. In contrast to some of these studies, however, we found a quite low proportion of bacteria (~3%) directly associated to plants, likely because of the different approach used to assign the predicted source (Methods). The contribution of different habitats to bacterial relative abundance was rather homogeneous across sampling locations, Fig. 1b. This agrees with other works on airborne communities that found no significant differences between frequency of potential sources among rural and urban areas [34].
Fungal communities were also dominated by taxa of soil origin (~60%) and to a less extent by fungi frequently associated to plants (~20%). Fungal taxa were also classified into different ecological guilds (Methods). Many of the identified taxa (~46%) were saprotrophs, as it is the case for many fungi found in soil. Especially frequent in this group were taxa related to wood decay (  Table S2. Remarkably, these few common genera constituted a sizable fraction of all the samples (between 31%-72% of individual sample abundance, ~50% of total prokaryotic abundance). The prokaryotic core is dominated by members of Actinobacteria and Proteobacteria, which are the most common phyla found in soil [68], such as Sphingomonas, Kocuria, Pseudomonas and Paracoccus Fig. 1c-d and In particular, the accumulated precipitation during the two weeks previous to this sampling period was much higher in these locations. Cloud formation can be triggered by ice nucleation activity proteins present in several species of Pseudomonas. Thus, these bacteria can be deposited on the earth surface and increase their local abundance rapidly [72]. In addition, many Pseudomonas species are saprophytes and plant pathogens, which would favor their rise in sites with abundant plant covering.
The fungal community was dominated by Ascomycota, and to a much less extent by Basidiomycota (Fig. S3). We identified only 4 core genera (Cladosporium, Alternaria, Epicoccum and Eurotium, all assigned to Ascomycota) that made up a noticeable but very variable fraction of the eukaryotic community across all the samples (between ~4%-86% of individual sample abundance, and ~54% of total fungal abundance). This inter-sample variability of the core taxa suggests that fungal airborne communities are more sensitive to local sources or environmental factors than prokaryotic communities.
A possible explanation is that most of the prevalent fungal taxa found in our samples are soil and plant saprotrophs feeding from debris of dead plants, whose presence may be largely influenced by the climatic season and the availability of nearby sources. As for bacteria, we collected the 10 most abundant genera across all samples and calculated their distribution by sampling sites, Fig. 1e, or seasonal period, Fig. 1f. Apart from the above mentioned core genera, other fungal genera such as Penicillium or Sporobolomyces are highly prevalent (> 90% of the samples). While different sites show a rather homogeneous distribution of the main fungal taxa, seasonal periods display significant differences, with a smaller contribution of these genera in Fall/Winter samples compared to the Spring/Summer periods (Welch's test, P < 0.001).

Seasonal features modulate microbial diversity
The microbiome in the near surface atmosphere can be influenced by changes both from nearby sources and from environmental factors [1,21,40,67,73,74]. As a first characterization of diversity across space and time, we estimated two alpha-diversity indicators for each sample: number of taxa (richness, Chao 1 index [75]) and similarity in species relative abundance (evenness, Pielou's index [62]) (Methods). We then grouped these indicators by samples belonging to the same location or seasonal period.
For each location, richness exhibited a large variability depending on the sampling period, Fig. S4, which prevents to detect significant differences among sites. In contrast, Other studies, however, reported a higher bacterial diversity in different seasons [4,36,78] pointing to an influence of multiple environmental factors combined with the climatic characteristics of the region.
Fungal communities show a significant increase in richness during Fall, Fig. 2c. In contrast to bacteria, all seasonal periods exhibited marked differences in evenness, Fig.   2d, with the more dissimilar communities corresponding to the summer periods. This agrees with the results of previous surveys in the Iberian Peninsula using microscopy techniques that showed an increase in abundance and types of fungal spores during Summer and, especially, Fall seasons [79][80][81].
Analyzing seasonal diversity in different years, similar general trends are observed (Fig.   S5). However, some inter-annual variability is also apparent. In bacteria, richness is significantly different between the spring periods of both years, Fig. S5a, and fungal communities show significant inter-annual differences in richness in Fall, Winter and Spring samples (Fig. S5c).
We next studied spatial and temporal variation in species composition between samples (beta-diversity). We first investigated the possibility of spatial correlation in our data (if closer locations contain more similar microbial communities) using Mantel tests separately for each sampling period (Methods). No significant correlations were found among beta-diversity and site geographical distances for the sampling periods analyzed.
In addition, we used Mantel correlograms and distance-based Moran eigenvalue maps to check that no significant spatial structure is present in the microbial communities sampled. Then, we applied principal coordinate analysis (PCoA), as described in Methods, to visualize gradients in our samples. When taxa were grouped by seasonal period, we observed a clear separation by season in both bacterial and fungal samples, year with indval values > 0.5 (Fig. S7). They were mostly plant leaves colonizers and pathogens. Of note, some abundant genera, Fig. 1e

Influence of environmental variables on airborne microbial communities
The seasonal patterns apparent in the community composition of airborne due to the chemistry of ozone production and destruction [84]. Temperature and relative humidity showed also a significant negative correlation (Fig. S10b), which is expected from the climatic conditions of Madrid, with a mixture of cold semiarid and Mediterranean characteristics.
To assess the influence of different environmental variables on seasonal changes in community composition, we regressed the species matrix on the environmental factors using distance-based redundancy analysis (db-RDA) [85,86]. The best explanatory variables were first identified by model selection (Methods), and then included in the constrained ordinations.
Seasonal trends in bacterial diversity are most significantly explained by the temperature gradient (Fig. 4a), followed by amount of PM 10 and average wind speed.
While total rain fallen during the sampling periods was not selected as one of the main explanatory variables, it may show an influence on the composition of the Winter communities in the 'green' locations, where especially G1 and G3 sites registered an elevated amount of precipitation during this season.
Temperature was also the dominant environmental factor explaining seasonal variations in fungal communities. Unlike bacteria, these communities seem to be also especially sensitive to rain levels (Fig. 4b), while wind speed explains the trend of Winter samples, in a similar way as in bacteria. Of note, the main representative fungal genus in our samples, Cladosporium, has been found to be positively influenced by temperature [79,87,88], in agreement with its more abundant presence in Summer and Spring samples (Fig. 1f).
Regarding chemical contaminants, NO 2 levels were not significantly associated to changes in composition of bacterial or fungal communities. Ozone levels were statistically associated to compositional variations, but its effect cannot be disentangled from that of temperature. Since there is no evidence for a direct influence of ozone levels on bacteria or fungi at the concentrations present in the near surface atmosphere (maximum of ~100 µg/m 3 in some Summer samples, Fig. S9) [89,90], we ruled out this variable in our analysis.

DISCUSSION
The presence of a common abundant core of bacterial genera modulated by environmental factors resembles findings in very different ecological niches, such as the soil [68], marine [91] and gut microbiomes [92,93].  [37] show a greater diversity of habitat.
The fungal community was less diverse and dominated by a few genera of Ascomycota, corresponding to taxa commonly found in soil that dominate across different ecosystems and geographies [97,98]. These genera are also ubiquitous in the atmosphere, and have been found in places with different environmental conditions and urbanization levels [34,74,79,99,100]. The likely reasons for the global prevalence of these taxa are their wind dispersal abilities, but also their flexible trophic capabilities and the higher potential for resource utilization [97], as it is the case with members of Alternaria, which are potential opportunistic plant pathogens, or Cladosporium, also common inhabitants of organic debris.
Evaluating the proportion of variance in beta-diversity explained by location, land coverage, seasonal period and year of sampling, we clearly found a large contribution of seasonal factors driving variations in community composition across relatively large spatiotemporal scales. These seasonal shifts of airborne microbes have been observed in other studies with different sampling methodologies and environments [4, 25-27, 36, 37, 39, 101, 102]. Seasonal variations are mainly manifested as changes in the abundance of pervasive and most representative taxa, some of which are especially adapted to particular climatic conditions. This is the case of the bacterial genus Hymenobacter, whose species are found in cold ecosystems and show a strong signal in Winter. In addition, we found some taxa unequivocally associated to particular seasonal periods, especially to Summer for bacteria and Fall for fungi. In contrast to seasonal indicators, we did not find taxa significantly associated to sampling sites or land coverage. This In addition to seasonal changes, we also found a noticeable inter-annual variability among airborne fungi, with some prevalent taxa preferentially present in one of the sampling years. This could be partly attributable to inter-annual differences in meteorological conditions, particularly precipitation levels [103], and hints to a greater sensitivity of fungi to climatic drivers [98,104].
The seasonal patterns in airborne community composition can be explained in part by seasonal variations in environmental factors. Temperature had a strong effect on both fungal and bacterial diversity, followed by average precipitation levels for fungal communities. Global studies of soil fungal communities have shown that temperature and precipitation explain main variations in worldwide fungal diversity [98,[104][105][106], with stronger contribution than soil features. Thus, it is reasonable to expect seasonal variations of these two factors to be tightly linked to seasonal changes in the composition of airborne fungi. These variables can influence in different ways fungal bioaerosols: temperature can directly accelerate metabolic rates favoring organism multiplication [105,107] and also contribute to physical detachment of fungi from soil and plant surfaces [73]. Likewise, precipitation can play different roles both altering the structure of the soil and plant communities [103], the main source of airborne fungi, as well as influencing their dispersion by promoting production of conidia or spore release [73].
As for airborne bacteria, several works have reported an increase in total bacterial numbers during warm seasons [38,108] consistent with the effect of air temperature on growth rates [105,108]. This could explain the high richness and low inter-sample variability found during the summer periods. In contrast to fungi, total particulate matter (PM 10 ) is found to be significantly associated to seasonal changes in bacteria, and likely influences both Summer and Fall communities. Due to their smaller sizes, bacteria can easily attach to fine inorganic particles and be transported jointly. In fact, previous studies have found correlations between total particulate matter and the amount and diversity of airborne bacteria [78,109].

CONCLUSIONS
In summary, our study provides evidence that the urban air microbiome is dominated by a few cosmopolitan taxa frequently found in soil, with a more homogenous composition than the airborne microbiome of rural or pristine environments, or at high altitudes.
This urban community is likely assembled by emission and dispersal from nearby sources, and homogenized by typical transport processes in the boundary layer of the atmosphere. While particularities of the local sources, such as plant coverage or differences in human activity, can have an impact on short-term and spatial variability, we find that most of long-term variability is associated to seasonal and climatic changes. The present work thus contributes to our knowledge of the human exposome in metropolitan areas and to the environmental drivers responsible for its variation.

Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and material
The datasets generated and/or analysed during the current study are available in the National Center for Biotechnology Information (NCBI), Sequence Read Archive (SRA), under the accession number PRJNA664957.

Competing interests
The authors declare no competing interests.    Chao1 index (species richness) and Pielou's index (relative evenness) for airborne bacteria (a and b, respectively) and fungi (c and d, respectively). Welch's-tests were performed to determine statistical differences between seasons and asterisks represent their significance: ***: P < 0.001; **: 0.001 < P < 0.01; *: 0.01 < P < 0.05.  Constrained ordinations of samples grouped by seasonal period with environmental factors (Methods). Asterisks represent significance of the environmental variables under permutation tests (1,000 permutations): *** P < 0.001; ** 0.001 < P < 0.01; * 0.01 < P < 0.05. The represented variables explain ~27% of the sample variance for bacteria and ~41% for fungi (adjusted partial variances), with temperature the most explanatory variable for both communities. The ordinations shown correspond to correlation biplots: angles between samples and arrows reflect their correlations.  Chao1 index (species richness) and Pielou's index (relative evenness) for airborne bacteria (a and b, respectively) and fungi (c and d, respectively). Welch's-tests were performed to determine statistical differences between seasons and asterisks represent their significance: ***: P < 0.001; **: 0.001 < P < 0.01; *: 0.01 < P < 0.05.  The functional/ecological guilds for fungal taxa were parsed with FUNGuild , and resulting annotations were re-assigned to the categories shown (OTUs assigned to different guilds appear in each of the corresponding categories). The bar plot shows the relative frequencies (number of OTUs) ascribed to each functional guild in each sampling location.   Figure S4. Alpha--diversity estimators across different sampling sites. Chao1 index (species richness) and Pielou's index (relative evenness) for airborne bacteria and fungi are shown in upper and lower panels, respectively. Figure S5. Alpha--diversity estimators across seasons of different years. Chao1 index (species richness) and Pielou's index (relative evenness) for airborne bacteria and fungi are shown in upper and lower panels, respectively. Welch's-tests were performed to determine statistical differences between samples of the same season and asterisks represent their significance: ***: P < 0.001; **: 0.001 < P < 0.01; *: 0.01 < P < 0.05. Fungal genera with preferential appearance in one of the sampling years. Only taxa with IndVal index > 0.5 are shown. Figure S8. Distribution of genera with potential pathogenic species across sites.  Principal component analysis (a) and scatter plots and correlations (b) of environmental variables. Samples are colored by seasonal period. Arrows in (a) represent correlations of environmental variables with the two main components. The confidence ellipses for the centroids of the different groups (Seasons) are also shown. The first two components accounted for 69% of the variance in the environmental matrix, but the third component also explained a noticeable amount of variance (13.7%). Temperature, relative humidity, ozone and NO2 levels were the factors contributing most to the first dimension, while rain and PM10 contributed mainly to the second component. While wind speed had a worse representation on these two main components, it contributed most significantly (>80%) to the third dimension. The diagonal panels in (b) show the distribution of values of environmental variables; upper triangle panels are Spearman rank correlation coefficients; lower triangle panels are scatter plots with LOESS smoothed fits.