Genome sequencing and comparative genomic analysis of highly and weakly virulent strains of Sclerotium rolfsii, the causal agent of peanut stem rot

Stem rot caused by Sclerotium rolfsii is a very important soil-borne disease to peanut worldwide. S. rolfsii is a necrotrophic plant pathogenic fungus with an extensive host range and worldwide distribution. It can infect peanut stems, roots, pegs and pods, leading to varied yield losses. S. rolfsii strains GP3 and ZY collected from peanut in different provinces of China exhibited signicant difference in virulence on peanut by articial inoculation test. In this study, de-novo genome sequencing of these two distinct strains was performed in the hopes of revealing the genomic basis of virulence differentiation.


Conclusions
The data provided a further understanding of the S. rolfsii genome. Detailed comparative genomic analysis provided clues into the genomic basis of its necrotrophic lifestyle and virulence diversi cation. Background Sclerotium rolfsii is a destructive soil-borne fungal pathogen. Its sexual stage, Athelia rolfsii, belongs to Basidiomycota and rarely occurs in nature; thus, its role in the life cycle of the fungus is unknown [1]. S. rolfsii infects more than 600 plant species, especially economically important agricultural and horticultural crops including peanuts, soybeans, wheat, cotton, tomatoes, potatoes, cucurbits, and onions [2,3], and is thus a pathogen of wide importance. Moreover, S. rolfsii produces sclerotia, which plays a key role in the disease cycle and has the ability to survive in soil for long periods [4]. In peanut (Arachis hypogaea), S. rolfsii can infect stems, roots, pegs, and pods and causes plant yellowing, branch wilting, and even entire plant wilting. Peanut stem rot caused by S. ro sii is also known as southern stem rot, southern blight, white mold, and Sclerotium rot [5]. This fungal disease has been reported in most peanut producing regions in the world. Loss caused by peanut stem rot estimated at 41 million US dollars in Georgia in 2011 [6]. Up to 30% yield loss were recorded in farmers' eld in India [7]. Peanut stem rot has been epidemic in China recently, caused up to 50% yield loss in hotspots, and was the most serious soilborne fungal disease in peanut [8].
Control of peanut stem rot disease is di cult because of wide range of hosts, profuse mycelium, abundant persistent sclerotia, and genetic variability of populations of S. rolfsii [4]. Currently, there are only a few resistant commercial peanut cultivars available for use [9][10][11]. Limited success was achieved in developing resistant varieties to peanut stem rot in China [12]. Normally, approaches to control S. rolfsii on peanuts include the application of fungicide and agronomic measures such as rotation with non-host crops or coverage of infected crop debris with deep plowing [13]. But these methods are still not enough for effective control of this disease.
In order to implement effective integrated management practices to control S. rolfsii on peanuts, knowledge about the genetic basis of differently virulent strains of S. rolfsii is a key component, as it is essential for host resistance assessment in a given region [14]. Earlier investigators observed differences in virulence among isolates of S. rolfsii in the USA and India [15][16][17][18]. They were classi ed as highly, moderately, and weakly virulent strains [16]. Until now, differences in virulence have not been reported among S. rolfsii strains in China. In previous research, virulence of S. rolfsii strains was found to be highly correlated with endo-PG production and growth rate [16], but the genetic basis of virulence was still unknown.
The genetic variability of S. rolfsii stains has not been documented. Correlations between pathogenic traits and genetic patterns have rarely been identi ed. To gain the relevant insights, we sequenced two S. rolfsii strains isolated from peanuts in different provinces of China. The two strains possessed similar cultural morphology and growth rate on PDA media but demonstrated different levels of virulence to peanut in inoculation tests. The ZY strain had high virulence, and the GP3 strain had weak virulence. The two strains were also similar in oxalic acid production in vivo. To decipher the basis of differences in virulence, comparative genomic analysis was conducted between these two strains. This study will be meaningful for further identifying determinants of pathogenicity as well as deepening understanding of S. rolfsii infection mechanisms.

Divergent virulence
The typical symptoms caused by these two strains on the peanut stems included unrestricted lesions on infection sites followed by tissue maceration and nally wilting. Disease severity was scored at 14 dpi and showed signi cant differences between these two strains. The disease index of ZY was 82.34, which was classi ed as highly virulent. The disease index of GP3 was 32.2, which was regarded as weakly virulent (Fig. 1a, 1b). The growth rate of these two strains was similar on PDA and showed no signi cant difference (Fig. 1c, 1d). There was no signi cant difference in the amount of oxalic acid (OA) produced by these two strains. OA was analyzed using the haloes revealed on the PDA containing bromophenol blue, and by the OA amount in the culture ltrate as analyzed by KMnO 4 titration after grown in PDB without shaking for 7 days (Fig. 1e, 1f).

Genome sequence and assembly
Genome sequencing of GP3 and ZY was performed using the long-read PacBio Sequel technology. High coverage (92.33 X for GP3 and 142X for ZY, respectively) PacBio subreads were assembled de novo using CANU [19]. Then, the assembled genomes were corrected by aligning subreads using the Arrow program with the default parameter. Finally, Pilon [20] was used to polish the resulting assemblies with Illumina data (6.42 GB for GP3 and 7.03 GB for ZY). The base accuracy of the assemblies was estimated by Illumina reads alignment. The assembled genomes of GP3 and ZY were 70.51 Mb and 70.61 Mb, respectively. The assemblies contained 27 contigs with an N50 length of 3.67 Mb for GP3, and 23 contigs with an N50 length of 3.71 Mb for ZY ( Table 1). The two strains had genome assemblies of a similar size, both slightly smaller than that of S. rolfsii Indian strain M10 (73.18 Mb). In comparison with other phytopathogenic species of Basidiomycota, the S. rolfsii strains' genomes were larger than those of Rhizoctonia solani AG1-1A (36.93 Mb) [21] and Ustilago maydis (19.61 Mb) [22], but smaller than those of Puccinia granimins (88.72 Mb) [23] and Melampsora laricipopulina (97.68 Mb) [23]. The integrity of the genome assemblies was assessed using BUSCO [24]. About 97.5% (1301/1335) and 97.2% (1298/1305) of gene groups required for the correct assembly of Basidiomycota were present in GP3 and ZY, respectively ( Figure S1). The genomes of S. rolfsii strains GP3 and ZY were deposited in GenBank under BioProject numbers: PRJNA635225, PRJNA635226, and BioSample numbers: SAMN15029893, SAMN15029894, respectively.
Gene candidates in the S. rolfsii GP3 and ZY genomes were predicted by a combined approach, and 17,097 and 16,743 genes with an average gene length of 2,013.91 bp and 2,039.76 bp were identi ed respectively (Table 1) In this study, we identi ed 356 tRNAs and 48 rRNAs in the S. rolfsii GP3 genome, and 415 tRNAs and 55 rRNAs in the S. rolfsii ZY genome (Table S1). The number of tRNAs was similar to that of G. luxurians (359) and much larger than that of P. tinctorius (195) [26]. The same number of snRNAs was identi ed in both GP3 and ZY genomes. GP3 and ZY shared a similarly low number of unique genes with 75 for GP3 and 37 for ZY distributed in 62 and 19 gene families (Table S2), respectively. These two S. rolfsii strains were highly similar to pairs of orthologs sharing on average 99.96% identity at the protein level (99.95% at the nucleotide level). Sequence comparison between two whole-genome assemblies of S. rolfsii GP3 and ZY revealed high genomic collinearity (Fig. 2). The results suggested that the two strains are closely related.

Repetitive element analysis
De novo and homology approaches were combined to identify repetitive sequences in the S. rolfsii GP3 and ZY genomes. A total of 14.75% and 14.66% repetitive sequences were generated for GP3 and ZY, respectively ( Table 1, Table S3). The abundance of repetitive sequences was similar between these two strains and much more than that of S. rolfsii strain MR10 [27], which had a repetitive sequence content of 3.73%. GP3 and ZY contained repetitive elements including DNA transposons, retroelements, and satellites. Retroelements were abundant in the studied genomes, accounting for 10.28% and 10.79% in GP3 and ZY, respectively. LTR was abundant in the retroelements, accounting for 9.85% and 10.38% in GP3 and ZY, respectively (Fig. 3, Table S4).

Orthology analysis and phylogenetic analysis
The entire sets of predicted proteins of S. rolfsii GP3 and ZY were clustered with the OrthoMCL program [28] to identify gene families. Comparative analysis of the genomes of related species of Agaricomycetes, Basidiomycota showed that S. rolfsii strains had larger genomes but fewer total genes in comparison to most of the other species ( Figure S6). It was supposed that S. rolfsii strains may undergo genome expansion during evolution. Of these families, the unclustered gene numbers of GP3 and ZY were the least among fungi in Agaricomycetes. A Venn diagram of the OrthoMCL revealed that S. rolfsii strains shared 4813 genes with other four Agaricomycetes species (Fig. 4a).

Homologs in PHI base
In total, we identi ed 4,600 and 4,603 potential pathogen-host interaction (PHI) genes belonging to 995 and 991 gene families (Fig. 5a). "Virulence" category was dominant in PHI genes, followed by "Pathogenicity" type ( Fig. 5b). Compared with S. rolfsii GP3, a total of 45 PHI genes were unique in ZY, 13 of which belonged to category of "reduced virulence". Besides these unique PHI genes, several PHI genes exhibited great expansion in the ZY genome. For instance, ZY contained more PHI 211 and PHI 2020 genes than GP3. We also found 58 PHI genes in GP3 were not present in the ZY genome, 12 and 18 of which belonged to categories of "loss of pathogenicity" and "reduced virulence", respectively (Table S5).

CAZymes
Plant cell wall carbohydrates formed a complex network of different polysaccharides that are subdivided in cellulose, hemicellulose, and pectin. These polysaccharides, together with others, are targets of the carbohydrate-active enzymes (CAZyme) that degrade them into simple monomers that can be used as nutrients. The genomes of S. rolfsii GP3 and ZY contained 957 and 925 genes encoding putative CAZymes, distributed in 118 and 119 CAZyme families, respectively. The CAZyme content of GP3 was slightly larger than that of ZY, and CAZyme content of both GP3 and ZY was more than that of S. rolfsii MR10 (902). Glycoside hydrolases (GH) were dominant in the GP3 and ZY genomes (51.62% and 52.54%, respectively), followed by carbohydrate-binding modules (CBMs) and glycosyltransferases (GTs) (Fig. 6a).
Comparison of CAZyme content of S. rolfsii strains with other plant pathogens including six saprotrophic fungi (Aspergillus niger, Bortytis cinerea, Pecillium digitatum, Sclerotinia sclerotium, R. solani, and Verticillum dahliae), three hemitrophic fungi (Colletotrichum higginsianum, Fusarium granimearum, and Magnaporthe oryzae), and three biotrophic fungi (P. granimis, Peronospora effusa, and U. maydis), to determine whether the life style of these fungi correlated with their CAZyme content and distribution. The CAZyme content of S. rolfsii genome was the highest among above pathogens analyzed (Fig. 6b). Necrotrophic fungi had more CAZymes than biotrophic and hemibiotrophic fungi. In comparison with other necrotrophic plant pathogens with a broad host range, such as S. sclerotinia, B. cinema, and V. dahliae, the CAZyme content of S. rolfsii was three times as much as these fungi. Compared to Basidiomycota plant pathogens, CAZyme content of S. rolfsii was also three times as much as R. solani and P. graminis, and four times as much as U. maydis. The high expansion of CAZymes in S. rolfsii suggested that CAZymes might play an important role in making nutrients accessible for S. rolfsii growth and may be a cause of its broad host range. Besides differences in CAZyme content, the number of CAZymes involved in cellulose, hemicellulose, and pectin degradation of S. rolfsii strains GP and ZY was also different from that of other pathogens analyzed (Tables S6-S8). The number of those CAZymes in S. rolfsii strains was noticeably larger than that in other pathogens analyzed, especially in the pectin degrading capacity. This suggested that cellulose, hemicellulose, and pectin-degrading enzymes expanded in the S. rolfsii genome compared to other pathogens. The S. rolfsii genomes investigated thus far harbored the largest number of carbohydrate-active enzymes, which suggested one possible reason that the Sclerotium genus succeeded as a necrotrophic fungus, namely, that S. rolfsii has a powerful capacity to break down the defense of plant cell walls and infect its hosts.
Glycoside hydrolases are known to catalyze the hydrolysis of glycosidic bonds in carbohydrate molecules. S. rolfsii was enriched in one glycosyl hydrolase family, GH28, a class of polygalacturonases involved in pectin degradation. The amount of GH28 was the same in GP3 and ZY (62 vs 62) and was signi cantly larger than the amount found in the other pathogen species analyzed (Table S8). The expansion of GH28 was not found in the biotrophic and some hemibiotrophic pathogens, such as U. maydis, P. graminis, and M. oryzae. Generally, necrotrophic plant pathogens possess more GH28 enzymes than biotrophic and non-pathogenic fungi [32]. In comparison with the other necrotrophic pathogenic fungi analyzed, S. rolfsii strains had three times more GH28 than others. Besides GH28, other glycoside hydrolases involved in pectin degradation in S. rolfsii, such as GH35, GH51, and GH78, also expanded in comparison to other pathogen species.

Secretome and effector
To establish a successful infection and evade plant defense responses during colonization, plant pathogens secrete proteins and other molecules, collectively termed effectors, to various host compartments [33,34]. A subset of secreted proteins from pathogens is expected to determine the progress and success of the infection [35].
Secretory proteins play crucial roles during the early infection of pathogenic fungi. The putative secreted proteins of S. rolfsii GP and ZY were identi ed based on a comprehensive pipeline ( Figure S7). The genomes of GP3 and ZY were predicted to encode 536 (3.14%) and 551(3.29%) secreted proteins, respectively. In relation to Basdiomycota pathogens, S. rolfsii had a smaller secretome than that of R. solani AG1 (9.17% of the proteome), and those of Puccinia (6.4%) [36] and Ustilago (6.2%). In general, necrotrophs that thrive on dead or dying plant cells encode high numbers of secreted proteins, but this phenotype did not appear in S. rolfsii GP3 and ZY.
The CAZyme content of S. rolfsii GP3 was larger than that of ZY as above mentioned, but the CAZyme content in the secretome of GP3 (30) was much less than that in ZY (151) (Fig. 6c, Table S9). Moreover, some CBMs and polysaccharide lyase (PL) families, which putatively degrade cellulose and hemicellulose, were found in the ZY secretome but not in GP3. In addition, ZY had more secreted GHs (113), which degrade cellulose, hemicelluloses, pectin, and CE, than GP3 (20) (Fig. 6d). The differences in CAZyme content in the secretome could provide clues to unravel the difference in virulence between S. rolfsii GP3 and ZY.
Effector proteins have been observed to be secreted by several necrotrophic microbial pathogens [37] and induce disease in their hosts. A total of 50 and 46 putative effector candidates for GP3 and ZY, respectively, were predicted by Effector P.1. After manual inspection and removal of candidates with known plant cell wall degrading catalytic domains, including CAZymes, peptidase, lipase, and peptidases, and with the criteria of 50 ≤ molecular weight ≤ 300 kDa, 0-1 predicted trans-membrane domain, and ≥ 4 cysterine residues, a total of 20 and 19 effector candidates for GP3 and ZY were identi ed (Table 2), respectively. Most of the putative effector candidates were small (average length of 170 and 139 amino acids, ranging from 121 to 278, and 58 to 272 amino acids for GP3 and ZY, respectively). These candidates were rich in cysteines (the average cysteine composition was 9.11% (GP3) and 8% (ZY)). Most of the effector candidates (60.0% and 73.7.1% of GP3 and ZY) with function were unknown, respectively. A total of 15.0% of these candidate proteins had no hits in the NR database. The function of these predicted effectors needs to be further veri ed in future research.

Discussion
Sclerotium rolfsii is a very important necrotrophic plant pathogen with a broad host range. To date, only one strain's genome has been sequenced, and there is little information on its virulence [27]. In the present study, we discovered two S. rolfsii strains that differed in virulence in peanuts after inoculation. These strains did not show signi cant differences in growth rate and oxalic acid production. Therefore, we conducted genome sequencing of the two S. rolfsii strains and produced gapless high-quality genomes in hope of unraveling the genomic basis underlying the differences in virulence between the two strains.
During pathogenesis, S. rolfsii may produce cell wall degrading enzymes such as endo-polygalacturonase (endo-PG) [38,39], cellulose [40], and polymethylagalacturonase [16] in conjunction with oxalic acid [38,16]. Bateman and Beer [38] suggested that oxalic acid, pectinase, and cellulase act synergistically in the destruction of host tissue by S. rolfsii. Secretion of oxalic acid and endo-PG concomitantly with rapid mycelial growth appeared to be the key requirement for establishing infection [16]. Earlier investigators observed differences in virulence among isolates of S. rolfsii [16,41,42] and found that virulence was highly correlated with endo-PG production and growth rate, provided a base level of oxalic acid was produced [16]. S. ro sii is renowned for its ability to acidify its environment through the secretion of organic acids. OA was reported to have a positive correlation to the virulence of S. ro sii isolates [42]. In contrast, OA was found to be not correlated with the virulence of S. rolfsii by Punja (1985) [16], who found that highly, moderately, and weakly virulent strains all produced similar amounts of oxalic acid. To investigate whether oxalic acid played an important role on virulence differentiation in S. rolfsii strains GP3 and ZY, we tested these two strains on the PDA plate containing bromophenol blue and measured the amount of OA produced in a liquid PDB medium by KMnO 4 titration. The results indicated that there was no signi cant difference in oxalic acid production between the weakly virulent strain GP3 and the highly virulent strain ZY, although oxalic acid is an essential virulence factor for S. slcerotiotum [43]. S. rolfsii produced a basic level of oxalic acid to acidic environment that facilitates the optimal activity of certain sets of cell wall degrading enzymes and peptidases, it was not the key factor for virulence differentiation between S.rolfsii GP3 and ZY.
Plant cell walls are an important barrier that plants use to protect themselves from attacking by a range of organisms. Plant cell wall carbohydrates form a complex network of different polysaccharides that can be subdivided into three categories: cellulose, hemicellulose, and pectin. Plant pathogenic fungi employ diverse gene repertoires, including carbohydrate-active enzymes (CAZymes), to invade host plants and subvert host immune systems [44,45]. CAZymes are known to play an important role in hostpathogen interactions and, along with effectors, are prime targets for studying virulence factors in fungi [46,47]. CAZyme families with potential roles in virulence were examined in detail in S. rolfsii strains GP3 and ZY. In our study, the CAZyme content in weakly virulent strain GP3 was found to be slightly more than that in highly virulent strain ZY. GP3 also had a noticeably higher number of enzymes in the AA, CBM, CE, and GH families (Table S11). GP3 and ZY had a similar number of enzymes involved in cellulose, hemicellulose, and pectin degradation (Table S12), these results indicated that CAZyme content was not related to the difference in virulence between S. rolfsii GP3 and ZY. We then undertook further analysis of the secreted CAZymes, which were involved in plant cell wall or fungal cell wall degradation and played an important role in phytopathogenic penetration of their hosts [48]. There was a signi cant difference in the levels of secreted CAZymes between GP3 and ZY. Highly virulent strain ZY possessed three times more secreted CAZymes (105) than weakly virulent strain GP3 (30). ZY also possessed more enzymes involved in pectin degradation, such as GH28. These results indicated that secreted CAZymes, especially polygalacutronases, may play an important role in virulence differentiation between S. rolfsii strains. This was in accordance with the results of Punja (1985) [16] who reported that the virulence of S. rolfsii was highly correlated with endo-PG production.
To establish infection, fungal plant pathogens secrete effector molecules that manipulate host physiology, including immune responses that are triggered when plant hosts sense invading pathogens [49][50][51]. Effectors have been discovered in multiple plant pathogenic fungi and exhibit numerous different functions depending on fungal lifestyle. Necrotrophic fungi, which require dead tissue on which to feed, often produce effectors that promote cell death, whereas biotrophic fungi, which require living tissue, produce effectors that prevent cell death [52][53][54][55]. In some soil-borne vascular necrotrophic pathogens that infect a broad range of host plants, effectors involved in virulence have been identi ed. In S. sclerotiorum, about 70 effectors have been identi ed [56], a small, cysteine-rich secreted protein with a cyanoviron-N homology (CVNH) domain, attenuated virulence when deleted [57]. A total of 127 putative effectors were identi ed in another broad host range necrotrophic pathogen V. dahliae VdLs17 strain [58]. Among them, the Vd2LysM effector was reported to contribute to host-speci c virulence [59], and VdCP1 contributed to virulence and triggered the host plant's immune system [60]. Up to now, little experimental evidence for the existence of similar effector proteins was available for S. rolfsii. To identify putative effectors involved in virulence, we searched the whole proteome of S. rolfsii and found that the effectors of GP3 and ZY were completely different. S. rolfsii existed as a multi-nuclear heterokaryon, in which individual cells may carry multiple nuclei. The method for the stable transformation of S. rolfsii has not been available yet, and thus functional testing of pathogenic candidate genes in further studies will be challenging.
Despite the variety of pathogenicity-related mechanisms involved, accumulating evidence indicates that necrotrophic plant pathogens interact with their hosts in a manner much more subtle than originally considered and that signaling between them plays a signi cant role in the lifestyle of these pathogens [61]. The mechanism of differences in virulence is complicated in plant pathogens. Besides secreted CAZymes and effectors that participate in virulence, other factors may also be involved in virulence to host plants. It was reported that the genomic islands might contribute to the expanded genetic diversity and virulence of V. dahliae [62]. Virulence-associated effectors were often found to have been affected by both repeat activity and repeat-induced point mutations (RIP) in Leptosphaeria maculans and S. sclerotiorum [63,64]. This work has provided important clues to factors involved in virulence differentiation among these two S. rolfsii strains. The data presented here will provide a useful foundation for further studies to explore the basis for pathogenesis, with the goal of understanding the mechanism of S. rolfsii infection.

Conclusions
We generated long-read PacBio reads and gapless genome assemblies of two S. rolfsii strains with different levels of virulence to peanut and then implemented a comparative genomic analysis of these strains. The genome of S. rolfsii encoded a high number of cell wall-degrading enzymes. The high number of genes of these categories most likely re ects the saprophytic lifestyle of S. rolfsii, which commonly feeds on the dead organic matter of plant origin. The obtained GP3 and ZY genome assemblies and annotation represent the few available Sclerotium genome resources for studying the pathogenic mechanism of this fungus toward peanut plants.

Methods
Isolates and oxalic acid production Scleotium rolfsii strains ZY and GP3 were originally collected from Henan and Guangxi provinces of China, respectively. These two strains were in different mycelial compatibility group (MCG) and exhibited similar growth rate on potato dextrose agar (PDA medium: 200 g peeled and sliced potatoes boiled for 20 min, 20 g dextrose, adjusted to pH 7.0, 20 g agar, to make the nal volume 1000 ml with distilled water) [65]. Oxalic acid production of S. rolfsii was detected by two methods. PDA containing bromophenol blue was used to test oxalic acid produced in PDA plate. Mycelium discs of each strain were placed in the center of PDA medium containing 0.5 g/l of bromophenol blue and kept at 30 °C in the dark, four petri dishes for each strain. The diameter of yellow halo was measured after three days. KMnO 4 titration was used to detect oxalic acid produced in liquid PDB. The strains grew in liquid PDB medium, three replicates of 150 ml asks containing 30 mL of medium were included for each isolate, three discs per ask were added and the asks were incubated without shaking at 30 °C in the dark. The culture of each strain was ltered through a Whatman No.1 lter paper after 5 days incubation. Oxalic acid (OA) content in 5 mL ltrate was determined using a KMnO 4 titration method following the procedure of Kritzman's [66].

Pathogenicity test
The experimental design was a randomized complete block with three replications. Plots consisted of three rows with a row length of 2.5 m and row space of 0.33 m. The peanut variety Zhonghua 21 was planted in all trials (15 plants per row) in plots. The plants were inoculated 50-55 days after sowing. S. rolfsii inoculum was prepared just before inoculation. Oat grains were soaked in water for 4 h, sterilized at 121℃ for 30 min twice after water removed. The fresh mycelium discs of S. rolfsii GP3 and ZY were transferred to the asks containing sterilized oat grains, respectively. The oat grains culture maintained in the dark at 30℃ until surface of grains covered by S. rolfsii mycelium. The oat grains inoculum was mixed with equal amount of sterilized sand to ensure uniform delivery of inoculum. Each plant was inoculated with 2 g of S. rolfsii oat inoculum and sand mixture. The plots were watered to eld capacity after inoculation. Disease symptoms were investigated 14 days after inoculation. A 1-5 scale for the severity of wilting according to Shokes' method [67], where 1 = no symptom, 2 = lesions on stem only, 3 = up to 25% of the plant wilting, 4 = 26-50% of the plant wilting, and 5 = > 50% of the plant wilting. Disease index was calculated by using the following formula. DI = {[Σ(number of plants × corresponding diseases scale)] / (total number of plants × maximum disease scale)} × 100. Different level of virulence was determined according to Punja (1985) [16], high virulence with DI more than 66.7, and weak virulence showing DI less than 33.3.

DNA and RNA puri cation
To prepare the genomic DNA and RNA for sequencing, the GP3 and ZY isolates were cultured on PDA plates overlaid by cellophane lms and maintained in the dark at 30 °C for 3-4 days. Mycelia were collected and grounded for DNA and RNA extraction. High-molecular-weight genomic DNA for singlemolecule real-time (SMRT) was extracted using the SMRTbellTM Templated Prep Kit 1.0 (PACBIO). The genomic DNA for Illumina sequencing was extracted using a CTAB method as previously described [68].
Total RNA was extracted from mycelia using the TRIZOL Kit (Invitrogen, Carlsbad, CA, USA) following the manufacture's protocol.

Genome sequencing and assembly
For PacBio Sequel genome sequencing, high molecular weight genomic DNA (20 µg ) was random sheared with Covaris-g-Tube with a goal of DNA fragments of approximately 20 kb and end-repaired according to the manufacturer's instructions. A blunt-end ligation reaction followed by exonuclease treatment was performed to generate the SMRT Bell template, then the library was quali ed and quanti ed using an Agilent Bioanalyzer 12 kb DNA Chip (Agilent Technologies, Santa Clara, CA, USA) and a Qubit uorimeter (Invitrogen, Carlsbad, CA, USA). SMRT Bell cells were sequenced using the PacBio Sequel sequencing platform (Nextomics Biosciences, Co., Ltd., Wuhan, China). After adaptor removed and low quality reads ltered out, a total of 9, 97 Gb with 8.80 kb average subreads were generated for ZY and 6.34 Gb with 10.68 kb mean subreads for GP3, respectively.
For Illumina sequencing, about 100 µg of genomic DNA were sheared to ~ 180 bp using a Covaris LE instrument and adapted for Illumina sequencing on Illumina Hiseq Xten platform (San Diego, CA, USA) by NextOmics Biosciences. Illumina short reads were trimmed using Trimmomatic version 0.36 [69], a total of 6.42 Gb and 7.03 Gb clean data were yielded for GP3 and ZY, respectively.
The cDNA libraries were prepared by Illumina TreSeq RNA Sample Preparation Kit (Illumina, Inc., San Diego, CA, USA) and validated according to Illumina's low-throughput protocol. The RNA-seq was conducted on an Illumina HiSeq 2500 Platform with 150 bp paired-end strategy.
A de novo genome assemblies of ZY and GP3 were generated with the PacBio Sequel reads using CANU pipeline (v1.5) [19] with default setting. The assemblies were adjusted using Arrow program, and polished using Illumina reads by Pilon [20]. Finally, the integrity of assemblies was evaluated using BUSCO [24].

Genome annotation
Gene predication was performed by using a combination of ab initio-based and homology-based methods. To aid gene annotation, we generated transcript assemblies based on RNA of GP3 and ZY, respectively. For ab initio-based prediction, Augustus v2.4 [73] and Genscan (version 1.0) [74] were used to de novo predict protein coding genes with the default setting. Exonerate [75] was used to predict the gene structure with RNA-seq data. For homology-based predication, GeneWise [76] was used to predict protein coding genes by homology analysis with known protein sequences from six related species of Basidiomycota, including Galerina marginata, Gymnopus luxurians, Hydnomerulius pinast, Jaapia argillacea, Piloderma croceum, and Plicaturopsis crispa. EvidenceModler (EVM) was used to compute the weighed consensus gene structure annotation [77]. The nal gene sets were obtained after removed genes with TE transposable elements by Tranposon PSI [78].
The predicted gene sets of S. rolfsii GP3 and ZY were functionally annotated based on similarity comparison with homologous in public databases. BLASTP was used to align the protein sequences by automated searches in NCBI-NR, Swiss-Prot (http://www expasy.org/sprot/), KEGG, GO and KOG database with E-values ≤ 1e-5. Gene function domain annotation was conducted by InterProScan program [79]. The pathway analyses were conducted by KAAS-KEGG Automatic Annotation Serve [80].
The candidate non-coding RNA (ncRNA) was annotated by two approaches, BLAST was used to align the S. rolfsii genome against the Rfam database [81], and tRNA scan-SE [82] and RNAmmer [83] were used to predict tRNAs and rRNA, respectively.

Phylogenetic analysis and synteny analysis
The phylogenetic tree of S. rolfsii GP3, ZY and the above related nine species of Agaricomycetes was constructed by single copy gene based on the orthologous gene families analysis. Mafft [84] software was conducted to align the protein sequence of the single copy gene, and converted to coding sequence alignment. Gblocks [85] was used to extract the well-aligned regions of each coding sequence alignments. RAxML 8.2.12 [86] was carried out to generate the maximum-likelihood tree with 100 bootstrap replicates with Psilocybe cyanescens as an outgroup. The whole genome aligner Murmer 3.06 [87] was used for comparative analysis of the assemblies of GP3 and ZY. Dot plots between contigs of GP3 and ZY were created by MuMerplot programs from the MuMmer package.
Identi cation of the pathogenicity related genes The S. rolfsii GP3 and ZY protein sets were used to conduct a BLASTP search against PHI base (a database of Host-Pathogen gene interaction) with e-value less than 1e-5 to identify pathogenicity genes.

Secretome and effector predication
The prediction of the S. rolfsii strains GP3 and ZY secretome was conducted based on the following pipeline. SignalP version 4.0 [90] was used to analyze signal peptide and cleavage sites of S. rolfsii GP3 and ZY proteins. Candidate proteins with signal peptide were identi ed by Protcomp 9.0 (http://www.softberry.com/berry.phtml) using the LocDB and PotLoc DB databases and proteins predicted as extracellular or unknown were kept for next analysis. The candidate proteins were conducted by TMHMM version 2.0 [91] to identify protein with transmembrane domains, and all proteins with 0 TM or 1 TM, if located in the predicted N-terminal signal peptide, were kept. The candidate proteins that harbored a putative glycophosphatidylinositol membrane-anchoring domain were identi ed by GPIsom (http://gpi.unibe.ch/) [92]. The remaining proteins without GPI-anchor were predicted with Target P [93], Table 2 Putative effectors of S. rolfsii GP3 and ZY.     Pathogen-host interaction (PHI) genes of S. rolfsii GP3 and ZY. a Distribution of S. rolfsii PHI genes in different phenotypes; b Percentage of different phenotypes in total PHI genes of GP3 and ZY.