Phyllosphere Microbiome in Response to Citrus Melanose


 Background: Like microbiomes in the rhizosphere, phyllosphere microbiomes can have an important role in plant growth and health. However, whether and how the phyllosphere microbiomes respond to the invasion of pathogens is not well understood. In this study, we address this question using the citrus phyllosphere-associated microbiome as a model.Results: Through DNA metabarcoding (16S for bacteria and ITS for fungi) and shotgun metagenomic sequencing, we found that phyllosphere microbiomes in different ecological habitats (epiphytes and endophytes) responded differently to melanose disease caused by the fungal pathogen Diaporthe citri on citrus (Citrus unshiu) leaves. We observed that citrus phyllosphere-associated microbiome responded to the melanose disease in five ways: (1) increasing microbial richness; (2) reducing community evenness; (3) enriching selected microbes; (4) enhancing microbial interactions; and (5) enriching functional features involved in metabolism and fungal cell wall degrading.Conclusions: Our study revealed how phyllosphere microbiomes in the epiphytic and endophytic habitats differ between diseased and healthy leaves. Based on the differences at both the taxonomic and functional levels, we propose a general conceptual paradigm to describe the different microbial community assembly processes for the phyllosphere microbiome in response to leaf disease and how such processes impact plant health. Our results provide novel insights for understanding the contributions of the phyllosphere microbial community response during pathogen invasion.

sequencing, we found that phyllosphere microbiomes in different ecological habitats (epiphytes and endophytes) responded differently to melanose disease caused by the fungal pathogen Diaporthe citri on citrus (Citrus unshiu) leaves. We observed that citrus phyllosphere-associated microbiome responded to the melanose disease in ve ways: (1) increasing microbial richness; (2) reducing community evenness; (3) enriching selected microbes; (4) enhancing microbial interactions; and (5) enriching functional features involved in metabolism and fungal cell wall degrading.
Conclusions: Our study revealed how phyllosphere microbiomes in the epiphytic and endophytic habitats differ between diseased and healthy leaves. Based on the differences at both the taxonomic and functional levels, we propose a general conceptual paradigm to describe the different microbial community assembly processes for the phyllosphere microbiome in response to leaf disease and how such processes impact plant health. Our results provide novel insights for understanding the contributions of the phyllosphere microbial community response during pathogen invasion.

Background
Like the rhizosphere, the phyllosphere (aboveground part of terrestrial plants) is an important niche of the plant [1]. The phyllosphere is inhabited by diverse microbes, with some microbes living on the surface of plants as epiphytes and others colonizing inside leaves as endophytes [2]. These microbes, which mostly include bacteria and fungi, form microbial communities that are associated with phyllosphere habitats. The microbial communities associated with the phyllosphere are collectively called the phyllosphere microbiome [3].
A number of studies have demonstrated that phyllosphere microbes can provide plants with nutrients, modulate plant growth, and maintain plant health through either competition with pathogens or induction of plant resistance against pathogens [4]. However, most of these studies have focused on the interactions between a few members of the phyllosphere microbiome. As a major line of defense against infectious diseases, a clear understanding of how the phyllosphere microbiome responds to infections can help us develop better biocontrol and prevention strategies against diseases. However, almost nothing is known about how the phyllosphere microbiome responds to infections at present. The rapid development of high-throughput sequencing makes it possible to explore the dynamic changes of phyllosphere microbial communities, including functional differences due to leaf infections.
In this study, we investigated and compared the phyllosphere microbiomes between diseased and healthy leaves from citrus trees under natural eld conditions. Different from most crops that have been investigated for phyllosphere microbiomes, citrus trees are perennial plants and have evergreen leaves with relatively stable phyllosphere phenotypes. Consequently, analyzing their phyllosphere microbiome could provide new insights about the stability of phyllosphere communities in perennial plants. Through metabarcoding (16S for bacteria and ITS for fungi) and shotgun metagenomic sequencing (Fig. 1), here we investigated the variations of phyllosphere microbial communities between healthy leaves and leaves infected by Diaporthe citri, a fungal pathogen that causes spot-like lesions in citrus leaves known as melanose disease (Additional le 1: Fig. S1) [5]. The aim of this study was to explore the microbial community assembly processes including microbial diversity, network structure, and functional enrichment of the phyllosphere in response to leaf disease.

Structure of the citrus phyllosphere microbiome
To explore variation in the phyllosphere microbiome when facing the invasion of pathogens, we generated a microbial community pro le for each sample using 16S and ITS metabarcoding data. In total, we obtained 6,770,888 sequences from phyllosphere samples. After denoising, removing low-quality and chimeric sequences with DADA2 [6], a range of amplicon sequence variants (ASVs) were obtained for the four sample types epiphytic bacteria, epiphytic fungi, endophytic bacteria, and endophytic fungi, with 2590, 664, 472 and 245 ASVs respectively based on 20 replicates of each (Additional le 2: Table S1).
In this study, the Richness index (Observed ASVs) and Evenness index (Pielou's Evenness) were used to assess the alpha diversity for each of the four microbiomes. Our results showed that the phyllosphere microbiomes from different ecological niches (epiphytes and endophytes) have different patterns of microbial diversity. Speci cally, epiphytes on diseased leaves had greater microbial richness for both bacteria and fungi (Kruskal-Wallis; p < 0.05) than those on healthy leaves. However, no signi cant difference was found between diseased and healthy leaves in their endophytes ( Fig. 2 and Additional le 2: Table S3). Interestingly, almost half of the total bacterial ASVs (48.6%) were found only in the epiphytic microbiome of the diseased leaves while only 14.4% were unique to the endophytes ( Fig. 3c and Additional le 1: Fig. S2c). Most of the shared bacterial ASVs (75.0%) between the healthy and diseased leaves in their epiphytes belonged to the following top ve classes: Alphaproteobacteria (26.8%), Gammaproteobacteria (20.8%), Bacteroidia (13.3%), Bacilli (8.0%) and Actinobacteria (6.3%). In contrast, 45.7% of the unique bacterial ASVs in the epiphytic microbiome of the diseased leaves belonged to these top ve classes of Alphaproteobacteria (12.4%), Gammaproteobacteria (11.6%), Thermoleophilia (9.8%), Subgroup 6 (5.9%) and Actinobacteria (5.9%). In the endophytic microbiomes, no signi cant difference of microbial richness was found between the healthy and diseased leaves. However, the community evenness of both bacterial and fungal communities in the diseased leaves were signi cantly lower (Kruskal-Wallis; p < 0.01) than those in the healthy leaves ( Fig. 2 and Additional le 2: Table S3). Interestingly, most of the differential ASVs in the endophytes of diseased leaves were those that showed depleted distributions, contrary to that in the epiphytic microbiome (Fig. 3a, d and Additional le 1: Fig.   S2a, d). Furthermore, there was no signi cant difference in the Evenness index of epiphytes between the healthy and diseased leaves for either bacteria or fungi ( Fig. 2 and Additional le 2: Table S3).
Unconstrained principal coordinate analysis (PCoA) of the Bray-Curtis matrix was used to investigate patterns of separation between microbial communities, and differential microbial analysis was used to assess the enrichment at ASVs and genus level. Our analyses found that the epiphytic and endophytic microbiomes from leaves have different microbial community assemblies. Speci cally, PCoA indicated that the bacterial communities between the diseased and healthy leaves differed signi cantly in both the epiphytic (PERMANOVA; P < 0.05) and endophytic (PERMANOVA; P < 0.01) habitats. In contrast, the fungal community assemblies did not show signi cant differences between the diseased and healthy leaves in the epiphytic habitat, while their endophytic communities exhibited signi cant differences (PERMANOVA; P < 0.001) ( Fig. 2 and Additional le 2: Table S3). Moreover, a higher percentage of bacterial ASVs than that of the fungal ASVs showed signi cant differences in epiphytes between the diseased and healthy groups (Fig. 3a, d). Among them, 118 bacterial ASVs (22% of total ASVs) were signi cantly different (Wilcoxon test; P < 0.05), among which 109 were enriched and 9 were depleted in the diseased leaves (Fig. 3a). By contrast, 27 fungal ASVs (13% of total ASVs) were signi cantly different (Wilcoxon test; P < 0.05), among which 19 were enriched and 8 were depleted in the disease leaves (Fig. 3d).
Further taxonomic analysis of 16S rRNA pro ling at the genus level showed that the enriched bacteria in the epiphytic habitat of the diseased leaves belonged to Alphaproteobacteria, Gammaproteobacteria and the other 8 classes (Additional le 2: Table S5). In comparison, all enriched bacteria in the endophytic habitat of diseased leaves belonged to Alphaproteobacteria (Additional le 2: Table S5). Moreover, Sphingomonas exhibited signi cant enrichment in both the epiphytic and endophytic habitats ( Fig. 3b and Additional le 1: Fig. S2b). Interestingly, while enriched in the endophytic habitat, Methylobacterium was depleted in the epiphytic habitat of the diseased leaves ( Fig. 3b and Additional le 1: Fig. S2b). Overall, the differentially distributed bacteria were mainly attributed to the following two functional categories: "plant growth promoting" (e.g. Bacillus, Methylobacterium and Sphingomonas) [7][8][9] and "plant metabolism-related" (e.g. Methylocella and Zymobacter) (Additional le 2: Table S5) [10,11].
In contrast, taxonomic analysis of ITS pro ling at the genus level showed that the enriched fungi in both the epiphytic and endophytic habitats of the diseased leaves belonged to four classes: Sordariomycetes, Eurotiomycetes, Dothideomycetes and Agaricomycetes (Additional le 2: Table S5). Speci cally, the genus Colletotrichum exhibited signi cant enrichment in both the epiphytic and endophytic habitats of the diseased leaves ( Fig. 3e and Additional le 1: Fig. S2e). In contrast, Alternaria and Nigrospora exhibited signi cant depleting in both the epiphytic and endophytic habitats ( Fig. 3e and Additional le 1: Fig. S2e). Functionally, most of the differentially distributed fungi belonged to "potential plant pathogens" (e.g. Colletotrichum, Alternaria and Nigrospora) (Additional le 2: Table S5) [12][13][14]. Of these, Colletotrichum [13] and Alternaria [12] contain known pathogens of citrus. Interestingly, 5/6 depleted fungi in the epiphytic habitat of diseased leaves also belonged to "potential plant pathogens" (Additional le 2: Table S5).
A large number of "Bacteria-Bacteria" interactions in epiphytes were intensi ed in diseased leaves To explore whether changes in microbial community assemblies were accompanied by changes in microbial interactions, we performed co-occurrence network analysis to uncover potential associations and complexity of connections within the phyllosphere microbiome in response of disease. This was followed by estimating the topological properties of the co-occurrence networks to identify differences between the diseased and healthy groups. The co-occurrence networks were analyzed by Spearman correlations based on quantitative data of ASVs.
Our co-occurrence network analysis revealed increased complexity in the diseased condition compared with the healthy condition ( Fig. 4 and Additional le 1: Fig. S3). A previous study showed that highly connected networks can form and strengthened when microbiomes face environmental disturbance, such as pathogen invasion [15]. Furthermore, the number of negative edges (negative interactions) of the networks in the disease group were higher than that in the healthy group (both in epiphytes and endophytes), suggesting that pathogen invasion intensi ed the microbial competition within both niches ( Fig. 4 and Additional le 1: Fig. S3). Interestingly, in addition to increasing the interaction between the bacteria and fungi, a large number of "Bacteria-Bacteria" positive interactions in epiphytes were also intensi ed in the diseased leaves, indicating that the epiphytic bacteria might strengthen interactions in response to the disease ( Fig. 4 and Additional le 2: Table S6).
Our topological analysis of networks provided further insights into the variation of interactions. In epiphytes, the network of the diseased group presented a higher number of connections per node (average degree = 6.20) than that in the healthy group (4.39), indicating a highly connected microbial community in the diseased leaves. Furthermore, when nodes from the fungi were removed, the number of connections per node among bacteria increased in the epiphytic habitat of the diseased leaves (average degree = 7.01) while that in the healthy group decreased (3.13). The results of this analyses further indicated intensi ed "Bacteria -Bacteria" associations in the epiphytic habitat of diseased leaves (Additional le 2: Table S6).
Numbers of metabolism-related and glycoside hydrolase genes were enriched in the epiphytic microbiome of diseased leaves To further explore the role of phyllosphere microbiome when facing pathogen invasion, especially on its impact on the interactions in the epiphytic habitat ( Fig. 4), we performed shotgun metagenomic sequencing of the epiphytic microbiome. The combined datasets of the diseased and healthy groups contained a total of 133.8 Gbp of raw reads. Quality ltering and removal of plant-derived reads resulted in 79.4 Gbp microbial reads. In total, 7,621,585 putative protein-coding genes were predicted from the assembly. After clustering sequences at 95% identity, 5,592,809 non-redundant genes were generated (Additional le 2: Table S2).
Of the genes retrieved from the shotgun metagenomic data, 28 to 58% were assigned to a known function (Additional le 2: Table S2). Functional annotations were obtained for approximately 53% of non-redundant genes by aligning to the KEGG database and generated 15,423 KOs (GhostKOALA Score > 10). Based on KEGG level 1 pathway annotations, KOs could be classi ed into metabolism (33.7%), genetic information processing (14.6%), cellular processes (11.2%), environmental information processing (10.2%) and organismal systems (9.7%) (Additional le 2: Table S7). Differential enrichment analysis (DESeq2; Fold Change > 2 or < -2, P < 0.05, FDR < 0.1) of the epiphytic microbiomes between the diseased and healthy leaves revealed that 1283 KOs were enriched in the diseased samples, whereas 498 KOs were depleted (Fig. 5a). Furthermore, we found that the enriched KOs belonged mainly to metabolism-related pathways. Speci cally, a large number of enriched KOs were involved in pathways of unclassi ed-metabolism (131 KOs), carbohydrate metabolism (124 KOs Table S8). Consistent with network inferences (Fig. 4), this provides further evidence that the metabolism-based interactions of epiphytic microbes were dramatically enhanced when infected with melanose disease. In addition to metabolism-related pathways, enriched KOs in the diseased group also include those involved in "Plant-Microbe and "Microbe-Microbe" signal interactions, including the pathways of ABC transporters (61 KOs), two-component system (56 KOs), quorum sensing (19 KOs) and bacterial secretion system (16 KOs) (Additional le 2: Table S9). Together, these results indicated that increased epiphytic microbiome-mediated molecular communications among microbes and/or between microbes and plants in response to disease.
Taxonomic diversity of the phyllosphere microbiome In this study, we used shotgun metagenomic data from the diseased group to compare microbial diversity between epiphytes and endophytes. Taxonomic annotation was assigned at reads level using NCBI nr database. Our analyses revealed that the epiphytic and endophytic niches had different microbial assemblies, suggesting their potential functional differences (Fig. 7).
Speci cally, bioinformatic analyses of the shotgun metagenome sequences of the epiphytic microbes revealed that 66.7, 33.1, 0.2 and 0.1% of the reads corresponded to Bacteria, Fungi, Archaea and Viruses, respectively. By contrast, Bacteria (95.2%) were the predominant group among the endophytic microbes, with small fractions of Fungi (4.4%), Archaea (0.3%), and Viruses (0.2%). The dominance of bacteria in endophytes was largely attributed to Acinetobacter (47.2%). The shared bacteria between epiphytes and endophytes in the top 10 genera by relative abundance including Acinetobacter, Bacillus, Mesorhizobium, Paenibacillus, Pseudomonas and Streptomyces. In contrast, the shared fungi in the top 10 genera by relative abundance including Aspergillus, Rachicladosporium, Fusarium and Colletotrichum (Fig. 7). Moreover, 0.29% of the fungal sequence reads in the epiphytic habitat corresponded to the pathogenrelated sequences (annotated as Diaporthe sp.), compared to 0.18% in the endophytic habitat.
De novo assembly of epiphytic bacterial genomes Next, we attempted to identify epiphytic bacterial strains that might contribute to microbiome-related disease responses by binning genomes using the metagenome-assembled genome (MAG) extraction approach (Fig. 8a). After re nement and re-assembly, 24 bins (draft genomes) had completeness > 70% and contamination < 10% (Additional le 1: Fig. S4-6). Twelve of the 24 bins were enriched in the diseased leaves (Fold Change > 2), with bin 1 (Methylobacteriaceae) being the most abundant in epiphytes (Additional le 2: Table S11). Interestingly, 4/5 Sphingomonadaceae bins (bin 3, bin 11, bin 13 and bin 19) were enriched in the diseased leaves (Fig. 8b), consistent with our metabarcoding data (Fig. 3b), and suggesting a potential role for Sphingomonadaceae bacteria in host response to pathogen invasion on the citrus leaves. All four bins of Sphingomonadaceae were enriched by > 25-fold in diseased leaves over the healthy group (Additional le 2: Table S11). Among these four bins, bin 13 had the highest quality (completeness = 90.52%, contamination = 7.46%, N50 = 16,164). The assembled genome was comprised of 260 contigs with a total length of 2.62 Mb. We named this genome CPG1, short for citrus phyllosphere genome 1. We determine the taxonomy of CPG1 using Taxator-tk [17] and PhyloPhlAn [18], both annotated the genome to Sphingomonas sp. (Additional le 2: Table S11).
To better understand the putative functions of Sphingomonadaceae in the phyllosphere, we selected CPG1 as a representative genome for subsequent analyses based on the phylogenetic tree (Fig. 8b). As reported in previous studies, Sphingomonas may be involved in inducing plant defence signaling pathways and in inhibiting the growth and virulence of pathogens, which could help reduce host plant damages in the phyllosphere [19]. Here, analyses of the binned genomes revealed that microbes enriched in diseased leaves contained more gene families associated with fungal cell wall-degrading enzymes (CWDEs), similar to results in shotgun reads (Fig. 6). Speci cally, we found that CPG1 contains 12 CAZy protein family related to fungal CWDEs, suggesting that the enrichment of the genus Sphingomonas in the diseased leaves is likely associated with antifungal functions. Furthermore, glycoside hydrolases (GH99, GH74 and GH164) were not found in any of these 24 bins, which related to the degradation of glucans or mannans. The results suggest that the enrichment of CWDEs genes in diseased leaves was likely driven by multiple microbes, emphasizing the overall role of the microbial community in response to plant diseases (Additional le 2: Table S11).

Discussion
The microbiome has long been recognized as an essential part of the crop ecosystem, which is closely linked to the growth of plants and resistance to diseases [20]. In recent years, the rhizosphere microbiome (underground part) has been extensively studied, while the phyllosphere microbiome (aboveground part) has been less studied. In addition, most plants investigated so far are either model plants such as Arabidopsis thaliana or annual crops. In this research, we used citrus trees for our investigation. These trees are perennial plants and have evergreen leaves with relatively stable phyllosphere phenotypes compared the most crops. Our research provided a systematic description on how the phyllosphere microbiome in citrus responded to invasion by the fungal pathogen Diaporthe citri [5] (Additional le 1: Figure S1) and revealed new insights into relevance of the phyllosphere microbiome to the health of a perennial plant.
In epiphytic habitat, we found that the diseased leaves had signi cantly greater microbial richness than that on healthy leaves (Fig. 2). Similar to the pattern observed for the rhizosphere microbes responding to disease invasion in other plants [21], the phyllosphere microbiome also had new microbial recruits to perform infection-response functions (Fig. 9). The signi cant increase of microbial diversity has led to a dramatic strengthening of interactions between microbes (especially among bacteria) in the diseased phyllosphere ( Fig. 4 and Additional le 1: Figure S3). Indeed, the strengthened interactions between certain microbes have been associated with improved plant health [15]. Moreover, infections have shown capable of inducing assemblage of a plant-bene cial microbiome [22]. Our observation that several growth-promoting bene cial microbes were enriched in the epiphyte of diseased leaves is consistent with previous reports (e.g. Sphingomonas, Rhizobiaceae and Coniothyrium) ( Fig. 9 and Additional le 2: Table  S5) [8,9,23]. Speci cally, our shotgun metagenomic data identi ed a large number of enriched genes related to metabolism and glycoside hydrolases in the epiphytic habitat in the diseased leaves (Figs. 5 and 6). Together, these results suggest that infection by the fungal pathogen D. citri induced new microbial recruitments and the enrichments of selected groups of microorganisms and genes associated with metabolism and antifungal activities.
Recent studies support that variations in plant microbiomes are not merely the results of their passive responses to external stresses but rather an active process involving selectively recruiting and enriching certain groups of microbes [19]. Our results highlight the essential role of epiphytes in resistance to pathogen invasion, similar to that found in the rhizosphere of several plants such as tomato and common bean [24][25][26]. In the rhizosphere, the response of microbial community assembly to pathogen infection was primarily driven by changes in root exudate [27]. Whether changes in leaf exudates caused by infection are responsible for changes in phyllosphere microbiome requires further study.
In the endophytic habitat, we did not observe obvious differences in microbial richness between the healthy and diseased leaves. However, we found that the community evenness was signi cantly reduced in diseased leaves (Fig. 2, 8). This result is similar to recent studies indicating that dysbiosis of endophytic microbiome in the phyllosphere caused the leaves to exhibit non-pathogenic symptoms of the disease [28], suggesting that the structure of the endophytic microbiome could play a key role in maintaining plant health. However, how diseases mediated by pathogenic invasions are related to dysbiosis of the endophytic microbiome remains to be further studied. Recent studies support that the endophytes constitute a second microbial layer of plant defense [29]. Our study revealed that several plant growth-promoting bacteria (PGPB) (e.g. Methylobacterium and Sphingomonas) [7,9] were enriched in endophytes in response to D. citri infection (Fig. 9, Additional le 1: Fig. S2b and Additional le 2: Table  S5). The process of responding to infections by enriching bene cial endophytic microbes is likely to be proactive, similar to that observed in the rhizosphere where the pathogen could activate diseasesuppressive functions in the endophytic root microbiome [30].
Aside from evidence for the recruitment of new microbes from the environment to the epiphytic habitat in response to D. citri infection, there is also evidence for recruitment from the epiphytic to the endophytic habitat ( Fig. 9). In our study, in the diseased leaves, Methylobacterium was depleted in epiphytes but enriched in endophytes (Fig. 3b, Additional le 1: Fig. S2b and Additional le 2: Table S5), suggesting a potential transfer. Disease-mediated microbial transfers between closely related ecological niches have been proposed for the rhizosphere habitats [31]. Methylobacterium has been widely studied as PGPB that plays essential roles in plant health and disease suppression in the rhizosphere [32]. Together, these results suggest that increased colonization of bene cial microbes might be a general response of plants to infections.
In addition to enrichment of bene cial microbes, "potential plant pathogens" were also enriched in the phyllosphere of the diseased leaves (e.g. Cyphellophora and Colletotrichum) (Additional le 2: Table S5) [13,33]. Recent studies found that attempts to explain the occurrence of diseases by identifying single pathogens often represented incomplete understanding of the true causes [34]. Instead, synergistic interactions among multiple microbial pathogens may represent the prevalent pattern in nature [35]. In our study, almost all of the depleted fungi (5/6) in epiphytic habitat of the diseased leaves were "potential plant pathogens" (Additional le 2: Table S5), further suggesting that microbial responses triggered by the invasion of one pathogen D. citri might suppress other, potentially synergistically acting groups of pathogens. This result consistent with our shotgun metagenomic evidence (numbers of fungal cell walldegrading related gene families were enriched in the epiphytic habitat) (Fig. 6).
Overall, this study indicates that the phyllosphere microbial community likely adopts multiple strategies to respond to pathogenic invasions. At present, there relative contributions to host disease resistance or microbial pathogenesis are unclear and require further studies. Interestingly, the patterns of variation among the phyllosphere microbiomes found in this study have many similarities to the rhizosphere microbiomes in response to diseases in many plants [36]. Such a broad similarity suggests that phyllosphere microbiomes most likely represent the rst line of defense against foliar diseases.

Conclusions
In this study, we explored how phyllosphere microbiomes in the epiphytic and endophytic habitats differ between diseased and healthy leaves at both the taxonomic and functional levels through metabarcoding and shotgun metagenomics approaches. We used multiple lines of evidence from "Microbial Structure to Network Interaction to Functional Annotation" to highlight the relevance of the phyllosphere microbiome for plant health. Based on the combined data, we propose a conceptual paradigm (Fig. 9) to describe the different microbial community assembly processes for the phyllosphere microbiome in response to leaf disease. Speci cally, these processes include: (i) recruiting new microbes, (ii) enhancing interactions among bacteria and (iii) enriching selected microbes in the epiphytic habitat; (iv) decreasing the microbial community evenness and (v) enriching selected microbes in the endophytic habitat; (vi) transferring and enriching microbes from the epiphytic to the endophytic habitats.

Sample collection and DNA extraction
Two groups ("Healthy" and "Diseased") of leaf samples (around 1 year old) of Citrus unshiu (> 20 years old), commonly known as tangerine and Satsuma orange, were collected from an orchard (28.64 N, 121.17 E) in Zhejiang province, China. The diseased samples were identi ed based on visual symptoms (Additional le 1: Fig. S1). In total, > 200 leaves of the two groups were collected, placed in sterile plastic bags, and immediately transported to laboratory in an ice-box and stored at − 80 °C pending further processing.
To collect the epiphytic microbiome, 6-8 leaves were rst transferred into a 250 mL conical ask containing 200 mL 0.01 M sterile phosphate buffered saline (PBS, 4 °C), and then subjected to sonication (15 min) and shaking (1 h, 200 rpm, 20 °C) to dislodge the epiphytic microbes from the leaf surface into the washing solution. The washing solution was ltered with 0.22 µm clean sterile cellulose membrane lter to capture all the microbes in the solution. The membrane was cut into pieces (2 × 2 mm) with sterilized scissors [37].
To collect the endophytic microbiome, the leaf samples washed using the procedure described above were surface sterilized by consecutive immersion for 1 min in 75% ethanol, 3 min in 1% sodium hypochlorite, and 30 s in 75% ethanol, followed with three rinses with sterile water. Subsequently, the treated leaves were freeze-dried using liquid nitrogen and homogenized using a sterilized pestle and mortar [37].
The leaf sample treated above (epiphytes or endophytes) represented one biological replicate, and was subjected to DNA extraction using the FastDNA® Spin Kit for Soil (MP Biomedical, Solon, OH, USA) according to the manufacturer's instructions. In total, four sample types were prepared, epiphytes from healthy leaves, endophytes from healthy leaves, epiphyte from diseased leaves, and endophyte from diseased leaves. For each of the four sample types, ten biological replicates were prepared for metabarcoding sequencing, and three biological replicates for shotgun metagenomic sequencing.

DNA sequencing
For targeted metabarcoding sequencing, DNA fragments were ampli ed using primers targeting the 16S rDNA V4 region for bacteria and the ITS2 (Internal Transcribed Spacer 2) region for fungi (Additional le 2: Table S1). PCR reaction was done for each sample under the following conditions: 30 s at 98 °C; 35 cycles of denaturation at 98 °C for 10 s, annealing at 54 °C for 30 s, and extension at 72 °C for 45 s; nal step 10 min at 72 °C [38].

Bioinformatics analyses
For metabarcoding data, the reads were analyzed using QIIME 2 (Quantitative Insights Into Microbial Ecology 2, version 2019.7) and its plugins [39]. Brie y, to obtain amplicon sequence variants (ASVs), we rst used DADA2 to remove low-quality reads and noises, including removing chimeras [6]. The ASVs were then assigned to taxonomy groupings based on comparisons with the UNITE database [40] for fungi and the Silva database [41] for bacteria using the q2-feature-classi er [42]. A small number of unclassi ed sequences were re-checked using BLAST with the NCBI nt database [43] to manually assign consensus taxonomy among the top hits with a minimum of 95% query coverage and 95% identity (Additional le 2: Table S4). Before the following diversity analysis, all samples were rare ed to the same number of reads. We calculated alpha diversity indices (Richness and Pielou's Evenness) and beta diversity metrics (Bray-Curtis) through the diversity plugin (q2-diversity).
For the shotgun metagenomic data, the raw reads were trimmed with the sliding window approach to generate the QC (Quality Control) reads using Trimmomatic [44]. Contamination of reads originating from the host plant were aligned to the nuclear genome of Citrus unshiu (GCA_002897195.1) [45], the plastid genomes of Citrus sinensis (NC_008334.1) [46] and Citrus maxima (NC_034290.1), and the mitochondrial genome of Citrus sinensis (NC_037463.1) [47] using Bowtie 2 [48]. After that, the concordantly mapped reads were removed to preserve the clean reads. To obtain the microbial reads and their taxonomic annotations, the clean reads were aligned against the NCBI nr (non-redundant) database [43] using Kraken 2 [49], and the reads that could not be aligned to bacteria, fungi, archaea or virus were ltered out.
The microbial reads from the epiphytic samples were pooled together for de novo assembly with MEGAHIT [50]. For the resulting contigs, ORF (Open Reading Frame) were predicted with Prodigal [51] in metagenomics mode and the non-redundant gene categories were generated using CD-HIT [52] with an identity cutoff of 95%. For gene annotation, DIAMOND [53] was used to annotate with nr database [43] and SWISS-PROT database [54]. For functional annotation, GhostKOALA [55] and DIAMOND were used to annotate with KEGG database [56] and CAZy (carbohydrate-active enzymes) database [57], respectively. Reads counts and TPM (transcripts per million) of each predicted gene were determined with Salmon [58]. Assembled metagenomic data were binned with MetaBAT 2 [59], MaxBin 2 [60] and CONCOCT [61]. Re nement and reassemble steps were then performed using the MetaWRAP [62] pipeline to combine and improve the results generated by the three binners (completion > 70% and contamination < 10%). We determine the taxonomy of each bin using Taxator-tk [17] against the NCBI nt database [43]. The bins were then quanti ed by Salmon [58] and predicted gene by Prodigal [51] with default parameters. All predicted genes from bins were annotated using dbCAN2 [63] against CAZy database [57]. A phylogenetic tree of bins was constructed using PhyloPhlAn [18] based on concatenated alignments of up to 400 ubiquitously conserved proteins and used the SGB (species-level genome bins) release [64] to assign to each genome bin its closest SGB. The thresholds of Mash distance we used to assign taxonomic labels were 0.05, 0.15, and 0.30 for species, genus, and families, respectively [64].
All statistical analyses were performed using speci c packages in R version 3.6.3 (The R Foundation for Statistical Computing, Vienna, Austria), except for the network visualization, which was carried out in Cytoscape [65]. For signi cance testing, the P-value of Kruskal-Wallis was used to compare the difference of alpha diversity. PERMANOVA (Permutational Multivariate Analysis of Variance) was used to examine whether the sample groups harbored signi cantly different microbial community (i.e. beta diversity). Wilcoxon test was used to identify the statistical signi cance in differences of the relative abundance at different taxa between the groups. At the functional level, differentially abundant KOs (KEGG Orthologs) and CAZy gene families were identi ed using DESeq2 [66].
For network analyses, co-occurrence networks of the samples were performed using the Spearman correlation matrix constructed with the igraph package to assess the complexity of the interactions in phyllosphere microbiome [67]. The high relative abundances (> 0.01%) and statistically signi cant correlations (P < 0.01, Spearman's coe cient > 0.7 or < − 0.7) among ASVs were included into the network analyses. To analyze the topology of the co-occurrence networks, we used a set of measures such as the numbers of edges, nodes and clusters, clustering coe cients, modularity, connectance, average degree, network diameter and average path length. Figure 1 A work ow for separation, DNA extraction, and sequencing of the phyllosphere microbiomes. Schematic representation of the bioinformatic tools used for analyzing the metabarcoding and shotgun metagenomic data.

Figure 2
Using alpha and beta diversity to compare the phyllosphere microbiomes between the diseased and healthy groups of leaves. Richness (observed ASVs) and Pielou's Evenness indices were conducted to characterize the alpha diversity and analyzed in Kruskal-Wallis to test for differences. Unconstrained PCoA (for principal coordinates PCo1 and PCo2) with Bray-Curtis metrics was conducted to characterize the beta diversity and analyzed in PERMANOVA to test for differences. Asterisks denote signi cant differences (*P < 0.05; **P < 0.01; ***P < 0.001) and NS denotes no statistical signi cance. Differential analysis at ASVs or genus level. a, d Manhattan plots showing ASVs enriched or depleted in the diseased group in phyllosphere microbiome. Each circle or triangle represents a single ASV. ASVs enriched or depleted in the diseased group are represented by lled or empty triangles, respectively (ASVs abundance > 0.01%, P < 0.05, FDR < 0.2). c, f Venn diagram depicting the number of ASVs identi ed in phyllosphere microbiome from the diseased and healthy group. b, e The enriched or depleted genus (if not annotated to speci c genera, using higher-level taxa to represent) in the diseased group of phyllosphere (genus abundance > 0.1%, P < 0.05, FDR < 0.2). Data bars represent the log fold-change (logFC).

Figure 4
Correlation networks for the epiphytic microbiome. Co-occurrence network based on Spearman correlation at ASVs level. Each node represents a single ASV, which was colored based on class level. A connection stands for a statistically signi cant (P < 0.01) correlation with magnitude > 0.7 (positive correlation, black edges) or < -0.7 (negative correlation, red edges).

Figure 5
Overview of functional pro ling (KEGG) in the epiphytic microbiome. Differentially expressed genes (DESeq2; P < 0.05, FDR < 0.1) at a KO (KEGG Ortholog) and pathway level (b Level 2) between diseased and healthy groups. a Enrichment and depletion of the KOs, fold change > 2 or < -2, respectively. b The number of enriched-KOs in the diseased group involved in KEGG pathway (Level 2). The circle sizes represent the number of enriched-KOs. The X-axis represents the abundance of genes (TPM, log2 transformed) involved in pathway. c Number and type of genes assigned to KEGG pathway (Level 1) in the diseased group. Data bars represent average number of gene (TPM, transcripts per million). Error bars represent 95% con dence intervals.

Figure 8
Binning of the shotgun metagenomic data from the epiphytic microbiome. a Scatterplot representing the distribution of the assembled contigs based on their GC content and abundance. b Phylogenetic tree of the draft genomes (bins, n = 24) from the MAGs (metagenome assembled genomes) data (completeness > 70%, contamination < 10%). The tree was produced from concatenated alignments of up to 400 ubiquitously conserved proteins identi ed with PhyloPhlAn and used the SGB (species-level genome bins) release to assign to each genome bin its closest SGB. Enrichment and depletion of the bins, fold change > 2 or < -2, respectively. The number of symbols correlates with the number of fungal CWDEs (cell walldegrading enzymes) based CAZy database.