Genomic Comparisons of Two Armillaria Species with Different Ecological Behaviors and Their Associated Soil Microbial Communities

Armillaria species show considerable variation in ecological roles and virulence, from mycorrhizae and saprophytes to important root pathogens of trees and horticultural crops. We studied two Armillaria species that can be found in coniferous forests of northwestern USA and southwestern Canada. Armillaria altimontana not only is considered as a weak, opportunistic pathogen of coniferous trees, but it also appears to exhibit in situ biological control against A. solidipes, formerly North American A. ostoyae, which is considered a virulent pathogen of coniferous trees. Here, we describe their genome assemblies and present a functional annotation of the predicted genes and proteins for the two Armillaria species that exhibit contrasting ecological roles. In addition, the soil microbial communities were examined in association with the two Armillaria species within a 45-year-old plantation of western white pine (Pinus monticola) in northern Idaho, USA, where A. altimontana was associated with improved tree growth and survival, while A. solidipes was associated with reduced growth and survival. The results from this study reveal a high similarity between the genomes of the beneficial/non-pathogenic A. altimontana and pathogenic A. solidipes; however, many relatively small differences in gene content were identified that could contribute to differences in ecological lifestyles and interactions with woody hosts and soil microbial communities.


Introduction
In recent decades, the genetics of fungal pathogenicity and symbioses have been studied in concert to identify potential patterns related to these divergent ecological roles. Genomic comparisons among pathogenic and symbiotic fungi have highlighted great diversity in gene content, genome size, repeat content, and number of chromosomes among fungi with distinct ecological roles (e.g., [1,2]).
Within the Agaricales of basidiomycota, 13,000 species have been described and ecological roles and lifestyles among these range from saprophytes to pathogens to ectomycorrhizal symbionts [3]. Evolutionary studies suggest that ectomycorrhizal lifestyle likely arose from saprophytic fungi [4]. Historically, ectomycorrhizal (ECM) fungi were thought to have reduced numbers of genes encoding plant cell wall-degrading enzymes (PCWDEs), including carbohydrate degrading enzymes. Recent literature has suggested that ECM fungi contain diverse repertoires of these genes encoding PCWDEs [5]. ECM fungi typically retain the distinct suites of PCWDEs and carbohydrate-active enzymes (CAZymes) of their saprotrophic ancestors; some, like the fungal CAZymes acting on pectin (GH28, GH88, and CE8), are expressed in ECM fungi on ectomycorrhizal root tips [5]. Furthermore, little evidence suggests that common gene repertoires exist among ECM fungi [1]. The genome of Laccaria bicolor, a well-described ECM fungus, contained twice as many secreted CAZymes (glycoside-hydrolases (GH), polysaccharide lyases (PL), and carbohydrate esterases (C)) than polysaccharide biosynthetic and modifying enzymes [6]; however, this pattern was also observed in the forest root pathogen, Heterobasidion annosum [7,8]. Similarly, both mutualistic and parasitic species of Agaricomycotina typically have an abundance of transposable elements [9]. Thus, differences in genomic content of Agaricales fungi with divergent ecological roles (e.g., saprophytes, pathogens, mycorrhizal symbionts, biological control agents) are difficult to detect.
The genus Armillaria (Basidiomycota, Agaricales) includes species that have diverse ecological roles within their respective environments. Several species are important plant pathogens of trees/woody plants, while several species are symbionts or even hosts of other organisms [10][11][12]. Armillaria species are also important decomposers in the forests where they occur, especially because they can degrade lignin [11]. Among Armillaria species, A. solidipes (formerly North American A. ostoyae) is considered one of the more virulent pathogens [13], although virulence varies depending on isolate, host age, and other factors [14]. Armillaria mellea and A. borealis are also considered virulent pathogens, while A. gallica, A. cepistipes, A. gemina, A. calvescens, A. sinapina, and A. nabsnona are considered less virulent or secondary pathogens [10,12,15,16]. A recently described species, A. altimontana, formerly North American biological species (NABS) X, is also usually considered as a weak pathogen [17], but evidence for pathogenicity is not well documented [18].
Armillaria solidipes (as A. ostoyae) and A. altimontana have been documented to co-occur within forest stands in the inland northwestern USA [18][19][20]. A previous study in northern Idaho, USA, provided evidence that A. altimontana can provide natural biological control of Armillaria root disease of western white pine (Pinus monticola) caused by A. solidipes. In this study, A. solidipes was uncommon in areas dominated by A. altimontana, and trees colonized by A. solidipes were associated with a lower growth and survival than trees colonized only by A. altimontana. The results demonstrated that A. solidipes and A. altimontana have two different and contrasting lifestyles: A. altimontana was not harmful to western white pine within the northern Idaho planting site and further suggests that A. altimontana behaves as a longterm, in situ biological control agent against virulent A. solidipes [20]. However, little is known about site factors or differences in microbial communities that enabled A. altimontana to outcompete A. solidipes within specific areas of this site. Recognizing the genetic or underlying soil factors that drive host-fungal interactions may provide approaches for enhancing the management of Armillaria root disease.
The distribution, life cycle, pathogenicity, and evolutionary relationships have been studied for several Armillaria species [10,12,14,16,[21][22][23][24][25]. Collins et al. [26] studied the genome and proteome of A. mellea, identifying carbohydrate-degrading enzymes, laccases, and lignin peroxidases among other gene-encoded proteins. Ross-Davis et al. [27] characterized the transcriptome of an A. solidipes mycelial fan infecting grand fir (Abies grandis), finding high expression of transcripts coding for PCWDEs, along with enzymes and ABC transporters that may help detoxify host-produced defense compounds. More recently, genomes of four Armillaria species (A. cepistipes, A. gallica, A. ostoyae, and A. solidipes) were sequenced by Sipos et al. [28]; this comparative genomic study revealed a rich repertoire of PCWDEs and pathogenicity-related genes in these Armillaria species regardless of their ecological lifestyles ranging from an aggressive pathogen to an exclusive decomposer. This study also identified expression of numerous pathogenicityrelated transcripts and proteins during fruiting body and rhizomorph development [24,28]. However, no studies have examined genomic differences between A. altimontana and A. solidipes, which may provide insights into genomic signatures of ecological lifestyles of fungi within Agaricales.
Assessing interactions among the soil fungi with different ecological lifestyles within the microbial communities is critical to understand disease development. As a well-known example, the association between mycorrhizal associations of fungi and roots allows for increased water and nutrient uptake that sustain tree health [29][30][31][32][33]. Microbes also breakdown litter and forest debris, which maintain forest health by improving soil quality and recycling nutrients that are required by plants [34][35][36][37][38]. In addition, pathogenic soil fungi function as selective agents that can cause mortality to maladapted trees, increasing the vigor and relative adaptation of residual trees in the stand [39][40][41]. In contrast, highly virulent soil pathogens can infect healthy tree roots, resulting in tree mortality that ultimately degrades the health of forest stands [41]. Additionally, the ability to favor beneficial microbes that inhibit root pathogens, which are notoriously difficult to mitigate, may enhance current management techniques [42,43].
Soil metagenomics or metabarcoding can be used to identify important fungi, bacteria, and archaea associated with tree health [43]. Determinations of the soil microflora can allow evaluations of treatment (e.g., soil amendments, forest thinning, underburning) influences on naturally occurring microbial communities that favor or suppress forest root diseases/pathogens, which could provide new approaches to manage Armillaria root disease [20,44,45].
Herein, we elucidate genomic and/or microbial soil profile differences associated with two Armillaria species that display divergent ecological lifestyles. We first present the genome assembly and annotation of the potentially beneficial A. altimontana isolated from northern Idaho, northwestern USA, and compare it with the genome of a pathogenic A. solidipes isolated from the same region. In addition, we compare the fungal and bacterial communities in the soil associated with the two Armillaria species. Putative secreted and non-secreted proteins encoded in each of the Armillaria genomes and the potential relationship with associated microbial communities are described, with emphasis on genes related to pathogenicity and fungal/ecological lifestyles. The data presented here contribute to understanding the ecological function of Armillaria species at the genomic level and will serve as resources for understanding genetic and ecological functions of these and other soil fungi.

Fungal Isolates for the Genome Sequencing
Armillaria solidipes (isolate ID001 [27]) was obtained from a culture of a basidiospore derived from a fruiting body belonging to a genet that was causing disease via an active mycelial fan growing below the bark of a live grand fir at the Clearwater National Forest, ID, USA. Armillaria altimontana (isolate 837-10) was obtained from a basidiospore from a fruiting body collected from the forest soil, with no host tree association ca. 2 km from the same location. After basidiospores were dispersed onto water agar, individual basidiospores were identified with a dissecting microscope and subcultured to establish haploid, basidiospore-derived cultures on 3% malt agar medium (3% malt extract, 3% dextrose, 1% peptone, 1.5% agar) for both species.

DNA Isolation
Isolates of A. solidipes and A. altimontana were grown for 3-4 weeks on 0.22-µm pore MF-Millipore™ Membrane nylon filters (MilliporeSigma, Burlington, MA) overlaying a membrane culture medium intended to reduce dark-colored exudates: 1.5% malt extract, 1.5% dextrose, 0.5% peptone, and 1.2% agar. The fresh mycelia (ca. 1-2 g) was scraped off from the nylon filter, ground by mortar and pestle with liquid nitrogen, and the DNA was extracted with the MoBio (Qiagen) DNeasy PowerMax Soil Kit (cat. no. 12988), following the protocol of the manufacturer.

Genome Sequencing and Assembly
PacBio sequencing and assembly of the two Armillaria species genomes were performed at the Laboratory for Biotechnology and Bioanalysis (LBB), Washington State University. Briefly, 10-15 µg of DNA were sheared using Covaris g-Tubes for 10 min at 1350 × g in a Minifuge 16 centrifuge (Beckman Coulter). Approximately, 5 µg of sheared DNA was processed for Pacific Biosciences SMRT bell library preparation following the "Procedure and Checklist-20 kb Template Preparation using BluePippin Size Selection System" (P/N 100-286-000-5) protocol (Pacific Biosciences) and the Pacific Biosciences SMRTbell Template Prep kit 1.0 (P/N 100-259-100). Resulting SMRTbell libraries were size selected using a BluePippin gel purification system (Sage Biosciences) according to the Blue Pippin User Manual and Quick Guide. The 0.75% agarose gel cassette was used with a cutoff limit set to 15-50 kb. The resulting SMRTbell library molecules had an average size of approximately ≥ 18 kb. Appropriate concentrations for the annealing and binding of the SMRTbell libraries were determined using the Pacific Biosciences Binding and Annealing calculator. SMRTbell libraries were annealed and bound to the P6 DNA polymerase for sequencing using the DNA/Polymerase Binding Kit P6 v2.0 (P/N100-372-700), following the recommended protocol from Pacific Biosciences but extending the binding times to 1-3 h, compared to suggested 30 min. The bound SMRTbell libraries were loaded onto the SMRT cells using the standard MagBead protocol and the MagBead Buffer Kit v2.0 (P/N 100-642-800). Then, the standard MagBead sequencing protocol was followed using the DNA Sequencing Kit 4.0 v2 (P/N 100-612-400) (typically known as P6/C4 chemistry). Sequencing data were collected for 6-h movie times, and Stage Start was enabled to capture the longest single reads possible. Genome assemblies were performed within the Pacific Biosciences SMRT Portal software. HGAPII was used following standard defaults for genome assembly.

Genome Assembly and Evaluation
Metrics for the genome assemblies, including scaffolds number, total length, GC content, and N50, were obtained using the QUAST [46] web server. Completeness of the assemblies was evaluated using BUSCO 2.0b2 [47]. BUSCO utilizes sets of genes present as single-copy orthologous in a number of species within a clade. For the evaluation of the Armillaria genome assemblies, the "Fungi dataset" and the "Basidiomycota dataset" were used. The default e-value of 0.001 was kept for the BLAST searches.

Phylogenetic Tree Analysis
Whole-genome phylogenetic tree was created using Realphy 1.12 [48] with Bowtie2 2.3.3.1 [49] for read mapping and PhyML [50] to build the tree. Preset options were used to run the Realphy pipeline. Whole-genome assemblies of Armillaria species available at the National Center for Biotechnology Information or the Join Genome Institute websites and the two genome assemblies described in this work were included. The number of bootstrap replicates in PhyML was set to 200.

Structural and Functional Genome Annotations
A set of repetitive sequences was obtained for each of the genome assemblies of A. solidipes and A. altimontana using RepeatModeler v.1.0.11 [51]. As the first step in the Maker v.2.31.8 pipeline [52], repetitive sequences files were used by RepeatMasker v.4.0.6 [53], to mask and obtain descriptions of interspersed repeats and low complexity DNA sequences. Next, the gene predictors Augustus [54], Gene-Mark-ES [55], and SNAP [56] were used for gene prediction in Maker. TRNAscan-SE [57] was also included to predict tRNA genes. For Augustus, a closely related species, Coprinus cinereus, was used as species model. SNAP was trained with a set of protein-encoding sequences from Armillaria species and closely related species, obtained from the NCBI and EnsemblFungi databases.
The set of proteins generated by Maker was functionally annotated using BLASTp v.2.9.0 + [58] and InterProScan v.5.20-59.0 [59]. Only proteins ≥ 50 amino acids were considered for these and further annotation analyses. For the BLAST search, a database was built with all entries for fungi in the UniProtKB database. A maximum e-value of 0.001 was used. For the InterProScan analysis, the Pfam application was included. Results from BLAST and InterProScan were added to the structural annotation in new gff3 files.
CAZymes were also annotated using the dbCAN2 server [60]; proteins involved in pathogenicity were searched using BLASTp at the PHI database version 46 [61]. Secondary metabolite clusters were annotated using antiSMASH server version 5.1.2 [62] with the detection strictness set to strict, including for the analyses the corresponding genome FASTA files of A. altimontana and A. solidipes. Deeploc 1.0, a program that utilizes deep learning algorithms [63], was used to predict the set of secreted proteins.

Structural Annotation Evaluations
Completeness of the Armillaria proteomes generated by Maker was evaluated using BUSCO 2.0b2. The "Fungi dataset" and the "Basidiomycota dataset" were used as assessed for the genome assemblies. The default e-value of 0.001 was also kept for the BLAST searches.

Synteny Analysis and Visualization
Analysis and graphs of synteny blocks (i.e., genomic regions of conserved gene content) were made using SyMAP 2.4 [64,65]. Genome assemblies and GFF3 files produced by Maker were used to obtain the synteny graphs. Default parameters were used to run the program, except the minimum sequence size was set to 10,000 bp.

Analysis of Orthologous Protein Families
The set of predicted proteins generated by Maker for A. altimontana and A. solidipes were used to identify orthologous and species-exclusive (non-orthologous) groups using the OrthoVenn2 web server [66]. Two parameters can be adjusted when using the OrthoVenn2 web server: e-value and inflation value; they were set to 1e − 10 and 2, respectively. These values were chosen to slightly increase the detection of more non-orthologous proteins compared to the default values (1e − 2 and 1.5, respectively).

Study Area and Field Sampling for Soil Microbial Analysis
The study area was located in northern Idaho at the USDA Forest Service, Priest River Experimental Forest (Fig. 1). The field site was a historic western white pine seed provenance plot within the Ida Creek study area (ca. 48°21ʹ48.75ʺ N and 116°49ʹ25.36ʺ W, elevation ca. 770 m.a.s.l.). In 1971, 2372 seedlings from Idaho and Washington were planted in a common garden plantation [20]. In 1987, all 2076 remaining trees were sampled for diameter at breast height (DBH), height, tree health status, and association with A. solidipes and A. altimontana, as described by Warwell et al. [20].
In 2016, 60 trees were randomly selected for sampling, based on the historical plot distributions of A. solidipes and A. altimontana. Three additional trees (ca. 63) were sampled with needle discoloration and the formation of mycelial fans on the base of the trunk, indicating the presence of A. solidipes. Tree measurements included DBH and tree health status, which was based on total crown, foliage color, insect and disease presence, and dead/live status.
For soil sampling, 1 m from the main stem of a tree near root zones, depths of duff and litter were measured at each cardinal direction in a 30-cm-diameter circle. The area was then cleared, and bulk soil samples were taken for each of the 63 trees using a 15-cm, split soil corer with a 15.9-mm (5/8 inch), compact slide hammer (AMS, #400.99, American Falls, ID). Samples were homogenized, 2 g were placed Armillaria rhizomorphs adjacent to the roots were also excavated using a small Pulaski-like gardening tool and brushes. Primary rhizomorph collections occurred on the same side as the soil core while an additional sample was collected 180° on the opposite side of the tree from the core. Rhizomorphs were placed in 15-ml tubes and placed on ice or 4 °C until isolation and culture.

Armillaria Isolation, DNA Extractions, and PCR
Rhizomorphs were plated for fungal isolation within 7 days of collection. Each rhizomorph was surface sterilized by an initial rinse with sterile-distilled water to remove the attached soil particles, followed by a soak in 20% Clorox® bleach solution (1.5% sodium hypochlorite, final concentration) for 6-10 min, a rinse with sterile-distilled water, and a soak in 3% hydrogen peroxide for 6-10 min. After a final rinse with sterile-distilled water, small rhizomorph sections (ca. 1-cm length) were plated onto 3% malt agar medium and incubated at 22 °C in the dark to promote mycelial growth of Armillaria.
For DNA extractions from Armillaria cultures, mycelia were sub-cultured onto Millipore™ membrane nylon filters that overlaid on membrane culture medium. After 2-3 weeks, mycelia were scraped off the nylon filters, and DNA was extracted from > 50 mg of mycelia using Zymo Fungal/Bacterial DNA extraction kits (Irvine, CA), following manufacturer protocols with a few modifications. To maximize DNA quantity and quality, three 3-mm glass beads were added to the cell lysis step prior to homogenization (Thermo Savant FastPrep ® FP120 Cell Homogenizer; Qbiogene, Carlsbad, CA) at 6.0 speed with two 30-s cycles. DNA concentration and quality were quantified using a Nan-oDrop™ 2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE).
For species identification, DNA was amplified at the translation elongation factor-1α (tef1) locus using primers EF-983 and EF-2218 [67] with an Eppendorf Mastercycler pro Thermal Cycler (Eppendorf, Hamburg, Germany). The PCR cycle was 94 °C for 2.5 min, 30 cycles of 94 °C for 30 s, 60 °C for 30 s, and 72 °C for 1.5 min, with a final cycle at 72 °C for 10 min. PCR products were visualized using gel electrophoresis, cleaned with ExoSAP-IT™ PCR Product Cleanup Reagent (Thermo Fisher Scientific, Santa Clara, CA), and then Sanger sequenced in two directions by Eurofins Genomics (Louisville, KY). Sequences were edited and aligned in Geneious R11.1 (https:// www. genei ous. com). Aligned sequences were identified by comparing to the NCBI (National Center for Biotechnology Information) database using BLASTn (https:// blast. ncbi. nlm. nih. gov/ Blast. cgi) [58].

Soil DNA Extraction Protocol and Sequencing
DNA was extracted from the soil samples preserved in Life-Guard™ Preservation Solution using MoBio Powersoil Total RNA Isolation and DNA Elution Accessory kits (Qiagen®, Carlsbad, CA), following manufacturer protocols. DNA qualification and quality were measured using a Nanodrop™ 2000 spectrophotometer.
Soil DNA (30 µl) was sent to the University of Minnesota Genomics Center and Colorado State University Next-Generation Sequence (NGS) lab for library preparation and sequencing on an Illumina MiSeq and paired-end 2 × 250 reads were generated. Out of 63 samples, a total of 57 were sent for sequencing; the six remaining samples were Split pixels represent trees that were associated with both A. altimontana and A. solidipes (Warwell et al., 2019, 20) excluded because they did not yield sufficient DNA concentration/quality. Libraries were prepared for the internal transcribed spacer (ITS2) region to sequence fungal communities and the v4 genomic region of the 16S rRNA to sequence bacterial communities. Primers ITS3 (5′-GCA TCG ATG AAG AAC GAG C-3′) and ITS4 (5′-TCC TCC GCT TAT TGA TAT GC-3′) [68] were used to amplify the ITS2 region, and primers 515F (5′-GTG CCA GCMGCC GCG GTAA-3′) and 806R (5′-GGA CTA CHVHHHTWT CTA AT-3′) [69] were used to amplify the v4 region of the 16S rRNA. DNA-free samples were included as negative controls to verify lack of microbial or DNA contamination in the buffers and primer sets. These sequence data have been submitted to the NCBI SRA database under accession number PRJNA767898.

Cleaning DNA Sequence Data
Data were cleaned to ensure base-calling accuracy of ≥ 99.9% using the paired-end mode in the program Trimmomatic v0.36 [70]. Sequences ≤ 100 bp in length, lowquality bases scores (≤ 15), and 4 bp sliding window regions with low average-quality scores (≤ 25) were removed from the data set. The software Mothur v1.40.5 [71] was implemented utilizing the standard operating procedure [72], with some adjustments, to call operational taxonomic units (OTUs) and classification of taxa. Following adjustments described in the SOP (https:// github. com/ Abdo-Lab/ Micro biome-Analy sis-Scrip ts/ blob/ master/ PE-de-novo-proce ssing. pl), UCHIME [73] was used for de novo identifications and removal of de novo chimeric sequences, and USEARCH [74], utilizing the dgc (distance-based greedy clustering) option, was used for clustering. Groups that were ≥ 97% similar were classified as belonging to the same OTU. Sequences associated with lineages of chloroplast, mitochondria, archaea, and bacteria were removed from the table of classified sequences. We utilized the 128 Silva database [75] and the UNITEv6_sh_dynamic_s [76] databases for bacterial and fungal taxonomic classifications, respectively, using Wang's Naïve Bayes classifier with a cutoff value of 80 [77]. Rarefaction curves were generated using the package "vegan" as implemented in R version 3.6.1 to assess diversity and suitability of depth of coverage per sample [78].

Statistical Analysis of Soil Microbial Communities
Using the RStudio interface to R (R Core Team 2017), alpha diversity, including Shannon diversity index and Inverse Simpson, was calculated using phyloseq [79], and rarefied richness (Richness) was determined in Vegan. Shannon's index was used to determine diversity utilizing the relationship to richness and rare microbes [80,81]. Inverse Simpson was used to identify diversity based on evenness and more dominant microbes [81]. Richness was considered as the number of individuals identified within a single sample, while evenness was used to explain the relative abundance of the different individuals [82].
The relative abundance of taxa associated with A. solidipes-and A. altimontana-associated soils was determined for the top fungal and bacteria taxa using a stacked bar graph with the metagenomeSeq package in R [83]. Differences among communities associated with Armillaria species were assessed using a PERMANOVA. Principle component analysis plots were completed in vegan to visualize fungal and bacterial soil differences associated with each Armillaria species.
Utilizing relative abundance data based on the resulting OTU table, bar graphs were generated using the ggplot2 package [84] in R for observed taxa with relative abundance > 1% at the genus level to describe the microbial community structure associated with each Armillaria species. The metagenomeSeq package [83] in R was used to fit a model that identified those OTUs associated with significance of model fit at a 0.01 level and minimum fold change of 2 (p values were adjusted for multiple testing), which was used to identify the driver of OTU differences between treatments. Core fungal and bacterial communities were created for each Armillaria species. Counts were calculated in R to assess the presence of an OTU corresponding to each species of Armillaria. Venn diagrams were produced using molbiotools.com to identify unique and shared fungal and bacterial taxa associated with A. solidipes and A. altimontana.
To identify influences of soil chemistry properties on soil fungal communities, a PERMANOVA analysis was completed using the vegan package in R. The analysis identified significant predictors by completing a forward stepwise analysis based on the subset of variables that minimized the Akaike information criterion (AIC).

Genome Assemblies of A. solidipes and A. altimontana
The PacBio assemblies resulted in a 73,739,702-bp genome for A. altimontana isolate 837-10 and a 55,735,298bp genome for A. solidipes isolate ID001; both isolates originated near Elk River, ID, USA ( Table 1). The ratio of genome sizes, A. altimontana/A. solidipes = 1.32, is consistent with the ratio of the reported DNA content per nucleus of these two species, 1.34 [19]. The corresponding genome assemblies were deposited at the NCBI with accession numbers JAIWYR000000000 for A. altimontana and JAIWYQ000000000 for A. solidipes.
In a whole-genome phylogenetic tree, two A. solidipes isolates from North America, ID001and 28-4, group together (Fig. 2), apart from but close to isolate C18/9 of A. ostoyae from Europe [28]. Figure 2 also shows the position of A. altimontana with respect to A. solidipes and other Armillaria species. Armillaria altimontana is contained within a clade comprising A. cepistipes B5 and A. gallica Ar21-2, which is distinct from the A. solidipes/ostoyae clade. After a custom library of repeats obtained using Repeat-Modeler was input to RepeatMasker, more bases in A. altimontana (18,346,415 bp) were masked compared to those in A. solidipes (9,691,790 bp). When comparing A. altimontana to A. solidipes, the relative proportion of masked bases (1.89) was larger than the ratio of their genome sizes (1.32). The percentage of genomic sequences occupied by interspersed repeats and low complexity DNA regions for A. altimontana and A. solidipes were 24.88% and 17.39%, respectively (Table S1); the largest percentages corresponded to retrotransposons. The most abundant retrotransposons were long terminal repeats (LTRs) as is common in other fungi [85].
Completeness of the genome assemblies was assessed using BUSCO using datasets for both the fungal and basidiomycota lineages. The A. altimontana genome assembly was 95.1% complete when compared to the fungal dataset and 96.7% when compared to the Basidiomycota dataset. The completeness values for A. solidipes were similar at 95.9% and 96%, respectively, and similar to those reported for other Armillaria species [28], which indicates high quality for genome assemblies.
Large blocks of shared synteny were found when comparing the A. altimontana and A. solidipes genomes (Fig. 3A shows the 20 largest scaffolds of each species), especially for some of the largest scaffolds of each species

Structural and Functional Annotation
The Maker annotation pipeline predicted several features for the genome assemblies, including CDs, exons, 5′-UTRs, genes, mRNAs, 3′-UTRs, and tRNAs, which were organized in GFF3 files (  Files 3 and 4) were functionally annotated using BLASTp against all the fungi entries in the Uniprot database and by using InterProScan including the Pfam application. These results were added to the final genome models produced by Maker, in GFF3 format ( Supplementary Files 1 and 2

Secreted Proteins
The program Deeploc was used to obtain corresponding sets of putative secreted proteins of A. altimontana and A. solidipes to search for differences that might reflect their lifestyle differences. A total of 1235 (6.4% of the total) secreted proteins were predicted in A. altimontana, and 1157 (7.1%) were predicted in A. solidipes. In A. altimontana, 322 secreted proteins had a CAZyme annotation, and 2 were cytochrome P450; in A. solidipes, the number of hits in each category was similar: 316 as CAZYmes, and 3 were cytochrome P450. No secreted proteins from either species had a blast hit with identity above 95% to proteins in the PHI database (data not shown). Some of the proteins had a CAZyme and a BLASTp hit, with one or several hits in the InterProScan search. But 99 secreted proteins in A. altimontana produced no hits and another 436 produced only BLASTp hits to uncharacterized proteins; in A. solidipes, 69 secreted proteins produced no hits and another 421 produced only BLASTp hits to uncharacterized proteins. However, many of these uncharacterized proteins could be considered "small secreted proteins" (see below). All those different annotations were combined and manually curated (Supplemental Files 3 and 4).
Numbers of secreted proteins with putative involvement in pathogenicity were obtained for each Armillaria species. The differences between the two species were small (Fig.  S1); the two major differences were a higher number of peptidases secreted by A. solidipes and a higher number of small secreted proteins for A. altimontana.
When grouped by probable function (Fig. 4), the major differences in predicted secreted proteins of A. altimontana and A. solidipes were associated with cell wall-degrading enzymes. Armillaria solidipes showed a slightly larger number of enzymes that degrade plant cell wall components: cellulose, hemicellulose, lignin, and especially pectin. Encoded protein-degrading enzymes also were more abundant in A. solidipes compared to A. altimontana (Fig. 4). Abundances of other encoded protein categories showed smaller differences.
The number of encoded proteins that could be considered "small secreted proteins," defined as those smaller than 300 amino acids (after being predicted as "extracellular"), was 678 (~ 55% of total secreted) in A. altimontana, 381 with ≥ 2% cysteine residues, and 594 (~ 51% of total secreted) in A. solidipes, 334 with ≥ 2% cysteine residues. Numerous encoded small secreted proteins (205 in A. altimontana and 172 in A. solidipes) were annotated as

Non-secreted Proteins
Numerous different functions were found among encoded proteins considered as non-secreted. Among them, those that matched CAZymes, cytochrome P450, transporters, or secondary metabolite clusters were further examined (Table 3). Transporters and secondary metabolites clusters were also included in these analyses because they have also been considered important for the lifestyle of fungal species [86,87]. The abundance of encoded proteins annotated as CAZymes, ABC transporters, and secondary metabolite clusters was similar between A. altimontana and A. solidipes (Table 3), whereas numbers of cytochrome P450 and all transporters were larger in A. altimontana. However, the ratio A. altimontana/A. solidipes encoded protein numbers for most categories was smaller than the ratio of the genome sizes (1.32) and total proteins (1.18); only cytochrome P450 ratio was slightly higher (1.25) than the ratio of total proteins (Table 3). When the abundance of the non-secreted CAZymes was grouped by substrate, the largest differences were found within encoded pectin-degrading enzymes with 58 in A. altimontana and 47 in A. solidipes; carbohydrate binding with 17 and 8, respectively; and lignin-degrading enzymes with 49 and 41, respectively (Fig. S2). Overall, most non-secreted CAZyme numbers were typically higher in A. altimontana in comparison with A. solidipes.

Genes Upregulated in Rhizomorphs
We searched for genes reported by Sipos et al. [28] as notable genes that were upregulated in rhizomorphs. Most of the categories had similar numbers between both Armillaria species, although A. altimontana possessed 62 more genes encoding cytochrome P450 (Table 4). A diversity of functions has been ascribed to cytochrome P450 proteins [88][89][90][91]. Caspase domain-containing proteins, part of proteases that have been associated with programed cell death in other organisms [92], were more abundant (10 more) in A. solidipes (Table 4). Relatively large differences were also found in numbers of genes encoding two enzymes involved in secondary metabolites synthesis: polyprenyl synthase, involved in terpenoid synthesis [93,94]: A. altimontana had 23 and A. solidipes had 12. In contrast, genes encoding trichodiene synthase, which utilize terpenoids to produce the trichodiene [94], were more abundant in A. solidipes with 12, compared to A. altimontana with only had three (Table 4).

Orthologous and Non-orthologous Proteins
Although approximately 62% of A. altimontana and 72% of A. solidipes predicted proteins grouped in 10,989 clusters of orthologous proteins, a large number, 7232, of proteins were non-orthologous in A. altimontana, and 4575 were non-orthologous in A. solidipes (Fig. 5A).
Out of the 10,989 clusters of orthologous proteins, 10,321 were two-protein clusters, which composed of one protein from each species; only 29 clusters had a difference larger than five proteins (Table 5). Of those, A. altimontana had more proteins in 24 clusters, whereas A. solidipes had more proteins in five clusters. Out of those 29 clusters, one cluster contained CBM67 proteins, which bind rhamnose residues in pectin, with 15 proteins from A. altimontana versus only one from A. solidipes. Another cluster contained ABC transporters, of which, A. altimontana also had 10 more than A. solidipes. Two clusters contained caspase domain proteins, with 17 more from A. solidipes than from A. altimontana. Other clusters corresponded to transposases, transcription factors, helicases, F-box proteins, and histone-modifying enzymes, while no annotation was found for 14 clusters (Table 5).
CAZymes and cytochrome P450 enzymes were found among non-orthologous proteins ( Table 5). The number of non-orthologous CAZymes was 91 in A. altimontana and 80 in A. solidipes, with small differences in number of individual CAZymes between the two species, which are similar to the differences found in secreted and non-secreted CAZymes. A few non-orthologous CAZymes were exclusive, but only GT32 was found exclusively in A. altimontana among CAZymes with a count > 5. For cytochrome P450,  Table 3). Many other proteins that were present in both A. altimontana and A. solidipes with the same Pfam-Interpro annotation were still considered non-orthologous by OrthoVenn2, and their numbers were also similar in most cases. Those with a number difference > 5 included polyprenyl synthase, trichodiene synthase, F-box protein, and glutathione S-transferase (Table 5). Other proteins were present only in one Armillaria species; most of these occurred in small numbers: three with counts > 5: Clp amino terminal domain pathogenicity island component; DDE superfamily endonuclease, only in A. altimontana; and sodium/hydrogen exchanger family, only in A. solidipes. Other non-orthologous proteins with smaller numbers, but with a possible functions in host-pathogen interactions, included terpene synthase (4) and transcriptional activator of glycolytic enzymes (4), which were found only in A. altimontana (Table 5).
Finally, 344 A. altimontana non-orthologous proteins were predicted as secreted with 327 of these representing small secreted proteins. For non-orthologous proteins from A. solidipes, 265 were predicted as secreted with 248 of these representing small secreted protein (Table 5).

Armillaria Species Identified from Field Plots
Rhizomorphs were isolated from 51 total trees, yielding 87 rhizomorph samples that all produced pure Armillaria cultures. Based on tef1 sequencing of the 87 cultures from rhizomorph samples, 48 trees were associated with A. altimontana, and three trees were associated with A. solidipes. Rhizomorph isolation was unsuccessful for 12; therefore, these samples were not utilized in analyses. Sequences corresponding to both A. altimontana and A. solidipes resulted in 99% identity during BLAST searches on the NCBI database.

Processing Sequenced 16S and ITS2 Libraries in Mothur
From the soil samples, a total of 2,156,476 and 4,323,028 raw paired-end 2 × 250 bp reads from 56 samples were generated from 16S and ITS sequencing, respectively. For the 16S dataset, the mean sequencing depth after  23 12 processing was 27,639 reads/sample, with a range from 6 to 107,582. Eighteen samples yielded < 5000 total reads and were removed from analyses for the 16S dataset. For the ITS dataset, the mean sequencing depth after processing was 51,806 reads/sample, with a range from 15,017 to 77,969. The total datasets yielded 26,781 and 6936 OTUs for the 16S and ITS2, respectively. The resulting rarefaction curves for these 16S sequence data indicate adequate sampling depth (Fig. S3). Matching to the Silva database resulted in a 16S dataset of 6677 unique OTUs, and matching to the UNITE database resulted in an ITS2 dataset of 2806 unique OTUs.

Differences in Community Alpha Diversity and Richness
Bacterial and fungal communities were assessed for overall rarefied richness and diversity (Fig. S3). We did not observe significant differences in richness among soil fungal communities associated with A.  Table S2). Additionally, A. altimontana had a slightly significant positive relationship (P = 0.053) and soil moisture had a significant negative relationship (P = 0.013) with fungal richness (Table S3). In the diversity analyses, the Shannon's diversity model was not significant (P = 0.489), while A. altimontana (P = 0.067; positive) and soil moisture (P = 0.078; negative) both had an influence on diversity. No edaphic variables significantly correlated to inverse Simpsons diversity measures across fungal communities (P = 0.558; Table S3). Soil moisture had a significant negative relationship with bacterial richness (P = 0.049; Table S4). Both A. solidipes (P = 0.039) and soil moisture (P = 0.022) had a significant negative relationship with bacterial Shannon's diversity. The bacterial inverse Simpson diversity model was significant (P = 0.0423), with soil moisture (P = 0.024, negative) as the lone significant predictor (Table S4).

Bacterial and Fungal Beta Diversity
Principal components analysis (PCoA) was completed to quantify beta diversity between bacterial and fungal communities associated with each Armillaria species. Beta diversity associated with soil bacterial communities of A. altimontana and A. solidipes were not significantly different (P = 0.544), and this is observed in the PCoA plot (Fig. S4A). Axes 1 and 2 described 21.7 and 12% of the variation, respectively. We observed that beta diversity indices were significantly different for fungal soil communities associated with Armillaria species as well (P = 0.016) (Fig. S4B). Compared to the bacterial communities, axes 1 and 2 described less of the fungal variation at 9.67 and 7.25%, respectively.

Core Communities Associated with Armillaria Species
Venn diagrams were constructed to identify the individual and core bacterial and fungal communities (Fig. 5 A  and B). Of the 6677 total OTUs, the core bacterial communities for soils associated with both Armillaria species consisted of 955 OTUs (14.3%). While a significant abundance, 5643 OTUs (84.5%), was uniquely associated with A. altimontana, only 79 (1.2%) were uniquely associated with A. solidipes (Fig. 5B). The core fungal community associated with both A. altimontana and A. solidipes consisted of 521 OTUs (18.6%). Far surpassing the core community, 2219 OTUs (79.1%) were unique to A. altimontana-associated soils, whereas only 66 (2.4%) OTUs were unique to A. solidipes-associated soils (Fig. 5C).

Taxonomic Trends and Relative Abundance
There were 17 16S bacterial families that exceeded the relative abundance of 1% (Fig. 6A). All 17 families were in soils associated with A. altimontana. Pseudomonadaceae was found in high abundance followed by Chthoniobacteraceae and Pyrinomonadaceae with A. altimontana. We observed a large relative abundance of Enterobacteriaceae, followed by Pseudomonadaceae with A. solidipes (Fig. 6A).

MetagenomeSeq analysis
We identified a total of four bacterial taxa that contributed significantly to the differential comparison between Armillaria species using the magnitude of OTU log-fold change (Fig. 7A). A proliferation, at 90% confidence, of Nitrosococcaceae (wb1-P19), Solirubrobacteraceae, Enterobacteriaceae, and Gammoproteobacteria_PLTA13_fa was found in A. solidipes-associated soils, whereas only uncultured bacteria were found to be significantly greater in A. altimontana-associated soils. We identified a total of five fungal taxa that contributed significantly to the comparison between Armillaria species using the magnitude of OTU log-fold change at the 90% confidence level (Fig. 7B). These analyses identified a proliferation of Atheliaceae, Suillaceae, Rhizopogonaceae, and unclassified fungi in A. altimontana-associated soils, whereas only a single OTU (unclassified fungi) was significantly more abundant in A. solidipes-associated soils.

Discussion
We report the high-quality genome assemblies with structural and functional annotations for two Armillaria species (A. altimontana and A. solidipes) that display different ecological lifestyles. In addition, we examined the potential role and relationship among microbial communities that may correspond with the different ecological behaviors of A. altimontana and A. solidipes. Armillaria isolates were obtained from a conifer forest in northern Idaho within the interior western USA, where A. altimontana primarily behaves as a saprophyte and potentially beneficial biocontrol agent enhancing the growth/survival of western white pine [20] and A. solidipes primarily acts as an aggressive pathogen of diverse conifers [11,13,95].  [28]. Although it has been proposed that A. solidipes and A. ostoyae are a single species [97], genomic differences result in a separate grouping for A. ostoyae (C18/9) from Europe, which is distinct from the two A. solidipes isolates from North America in the wholegenome phylogenetic tree. Thus, the phylogenomic analysis further supports that North American A. solidipes is distinct from Eurasian A. ostoyae.
Armillaria altimontana has a larger genome and a proportionately larger number of protein-coding genes compared to A. solidipes. It had twice as many sequences coding for repetitive elements (18,346,415 bp) than A. solidipes (9,691,790 bp); however, this difference is less than would be accounted for from the larger genome size alone (73.7 Mbp versus 55.7 Mbp, respectively). Rather, it has been suggested that gene family expansion has driven the increase of genome sizes in Armillaria in comparison with other Agaricales [28], and this could also be the mechanism responsible for the expanded genomes of A. altimontana, A. cepistipes (75.5 Mbp), and A. gallica (85.3 Mbp). Nevertheless, the genomes of A. altimontana and A. solidipes shared large blocks of synteny (i.e., large blocks of gene order when comparing the two genomes) suggesting that their gene sets are similar within the genomes of two Armillaria species. Interestingly, A. altimontana genome encodes only slightly more secreted proteins, secreted CAZymes, ABC transporters, and secondary metabolite clusters, but with slightly fewer tRNA genes compared to A. solidipes. Though A. altimontana has a larger genome than A. solidipes, the genomes are similar in synteny and in gene content; however, A. altimontana contains considerably more predicted non-secreted proteins.
Although relatively few genomic differences were observed, genome signatures of lifestyle differences between A. solidipes and A. altimontana were highlighted by the variation in putative secreted proteins. Approximately 1200 encoded proteins in the two Armillaria genomes were found as potentially secreted, which could be potentially also considered as potential "effectors" -important proteins for interactions with a host [98]. Both species were well-equipped with genes encoding enzymes to degrade cell wall components, including cellulose, hemicellulose, lignin, pectin, proteins, and others. The major differences between the two Armillaria species are that A. altimontana had more carbohydrate-binding enzymes, beta-glucan-degrading enzymes, and more proteins predicted to be involvement in host interactions (hydrophobin, cerato-platanin), especially small secreted proteins [99]. In contrast, the genome of A. solidipes encoded more secreted putative cellulose, hemicellulose, pectin, and lignin-degrading enzymes. The combination of these secreted proteins could confer A. solidipes with a higher ability to infect and cause damage to its host. However, this hypothesis requires functional tests, because the number of CAZymes varies widely when comparing fungi with similar or different lifestyles [100]. Also, it has been found that phylogenetic history can have a more important influence on secretome composition than lifestyle [101].
We found that more than half of predicted proteins in A. altimontana and almost two-thirds of A. solidipes predicted proteins could be considered orthologous. Most of these clusters had very similar numbers of proteins that were encoded in the genome of each Armillaria species. In contrast, all A. altimontana CBM67 proteins (15) were in a cluster, with only one CBM67 protein encoded by A. solidipes. Of the 15 A. altimontana CBM67 proteins, 11 were predicted as secreted, as was the one A. solidipes CBM67 protein. CBM67 is one of several CBM considered "lectin-like," and we speculate that in Armillaria spp., CBM67 proteins may have an additional functionality other that help in pectin degradation, such as interactions with the host and other organisms [102], particularly in A. altimontana which contains more CBM67 genes.
Among non-orthologous proteins, another major difference was observed in the total numbers of gene encoding cytochrome P450 enzymes: A. altimontana had considerably more both non-orthologous cytochrome P450 genes and a higher total of all cytochrome P450 genes, which are known to be involved in numerous metabolic pathways and biological processes including degradation of lignin and xenobiotics, secondary metabolite synthesis, and adaptation to different environments [88][89][90][91]. The versatile activities of cytochrome P450 enzymes make it difficult to assign a specific function for them, but we speculate that a larger number of genes encoding these enzymes, including many non-orthologous enzymes, could be associated with different lifestyles, in this case more saprophytic lifestyle for A. altimontana and a more pathogenic lifestyle for A. solidipes. A recent report also found a larger number of genes coding for cytochrome P450 in the saprophytic A. cepistipes compared to the pathogenic A. ostoyae [103].
Although non-orthologous proteins could have the similar molecular functions, their sequence differences could change their interactions with their substrates, regulation, or environmental optima [104]. Furthermore, the expression levels and the timing of expression could account for important ecological differences in how the two Armillaria species interact with their hosts [103]. Variations in the expression of many genes, including some related to pathogenicity, have been observed even among strains of the same species, and these variations are associated with different levels of virulence during host infection [105].
Rhizomorphs are an important and unique means by which that Armillaria species interact with their environment, hosts, and substrates. Differences in microbial communities associated with each Armillaria species can perhaps be putatively attributed to genes encoding enzymes similar to those secondary metabolite synthesis enzymes previously identified in Armillaria species [28]. We observed some differences when comparing the number of genes upregulated in rhizomorphs in the two Armillaria species. For example, polyprenyl synthase genes involved in terpenoid synthesis were more abundant in A. altimontana compared to A. solidipes, whereas genes involved in the production of trichodiene, a potential signaling molecule or mycotoxin [106], were more abundant in A. solidipes. In general, terpenoids can have many different structures and functions, which have been involved in the interaction between fungi and plants and other organisms [107][108][109][110]. Trichodiene, on the other hand, is the immediate precursor to a family of toxins that cause damage to plant hosts [111]. Although overall differences of soil microbial communities were not observed in association with the two Armillaria species, several bacterial taxa were more differentially abundant in soils associated with A. solidipes, and several fungal taxa were more differentially abundant in association with A. altimontana. We identified the most significant logfold change of three Proteobacteria taxa within the Gammaproteobacteria class and Enterobacteriaceae in association with A. solidipes. Interestingly, one of these included Nitrosococcaceae wb1-P19, which is thought to be a nitriteoxidizing autotrophic bacteria and that has previously been observed in caves [112][113][114]. In contrast, Gammaproteobacteria PLTA13_fa was found in high numbers in a Mn oxideproducing biofilm [115]. With these unique characteristics, it is not unexpected that some taxa within Proteobacteria have been characterized in soils contaminated with pesticides [116]. A recent study examining healthy ginseng (Panax ginseng) and ginseng with rusty root disease also found that several Proteobacteria were found in high abundances among diseased plants [117], and increases in Proteobacteria were also observed in association with changes of cover types from forest to grasslands in Hawai'i [118]. In addition, Enterobacteriaceae taxa have been associated with higher levels of Fusarium wilt disease of banana [119], but these taxa were also found in higher levels in asymptomatic Kauri trees compared to those infected with Phytophthora agathidicida [120]. Similar to our study, however, Byers et al. [120] found that Solirubrobacterales were more abundant with trees in decline. Przemieniecki et al. [121] surveyed bacterial communities associated A. ostoyae rhizomorphs during three stages of tree decline. They observed that rhizomorphs that were rich in Parabacteroides, Clostridium, and Bacillus were able to hydrolyze diverse organic compounds that could assist Armillaria rhizomorph enzymes in wood decomposition. Though these taxa were not present in our study, the high abundance of several taxa in the Proteobacteria and Enterobacteriaceae suggests potential recruitment of bacterial taxa by A. solidipes to assist in wood degradation. Alternatively, these results could suggest that the abundance of these taxa may be associated with tree mortality and/or changes in the plant community due to the activity of A. solidipes. Furthermore, as the rhizosphere of the infected tree begins to degrade, these taxa may thrive because of their unique abilities to breakdown complex plant root materials. Several studies have observed changes in bacterial communities associated with declines in plant communities [122][123][124]. More research is needed better understand signaling among members of the pathobiome during tree decline that could foster changes in the associated bacterial communities.
We identified several fungal taxa that were more significantly more abundant in association with A. altimontana. Several of these taxa have been shown to increase plant productivity through multiple functions, such as ectomycorrhizal fungi including Atheliaceae, Rhizopogonaceae, and Suillaceae [125][126][127][128]. Taxa from all three ectomycorrhizal fungal families were significantly more differentially abundant in soils associated with A. altimontana compared to A. solidipes, allowing increased uptake of water and nutrients to enhance tree defenses against root diseases [31]. The functions of these A. altimontana-associated soil fungi suggest that these fungal communities may also contribute to the overall health of the forest stand, corroborating Warwell et al. [20], who found that trees associated with A. altimontana were larger in both diameter and height than trees not associated with this Armillaria species. It remains unknown if A. altimontana is conducive to mycorrhizal fungi though evidence provided herein suggests that A. altimontana cooccurs with mycorrhizal fungi.
Utilizing the naturally occurring soil fungal communities to assist in the management of Armillaria root disease may be key to long-term protection of residual trees on sites infested with pathogenic Armillaria spp., such as following Armillaria root disease-associated mortality or silvicultural thinning practices. Beneficial microbes can minimize pathogen inoculum loads by reducing pathogen growth or inhibiting pathogen infection of susceptible hosts [41]. In this study, a greater diversity of mycorrhizal and saprophytic fungi was observed in association with the beneficial/nonpathogenic A. altimontana, demonstrating that mycorrhizae may have a direct influence on hosts within forested environments associated with Armillaria species [125].
Selecting trees to sample Armillaria species was the greatest limiting factor in this study. More than 25 years before this study, A. solidipes was well represented on the site [20]. The small number of A. solidipes-infected trees in this study perhaps reflects the protective role of A. altimontana and the associated microbial community in suppressing A. solidipes; however, additional studies and surveys are needed to support this hypothesis. The survey approaches used in our study yielded rhizomorphs for 78% of the trees and adequate DNA from 90% of the samples. Additionally, the use of metatranscriptomics could further our understanding of the fungal microbes and their ecological functions within the soils associated with A. altimontana and A. solidipes.
In conclusion, we found high similarity comparing the genomes of between the beneficial/non-pathogenic A. altimontana and pathogenic A. solidipes. The larger number of proteins encoded within A. altimontana genome results from moderate increases across many different gene families instead of a large expansion of a few gene families. However, we found many relatively small differences in genes that could contribute to differences in ecological lifestyles and interactions with woody hosts and soil microbes (fungi and bacteria). We did observe, however, that soil microbial communities may act in concert with A. altimontana to produce suppressive soils that help protect trees from Armillaria root disease, caused by A. solidipes. This study further suggests that novel approaches for managing Armillaria root disease could be based on management practices that favor naturally occurring, non-pathogenic Armillaria spp. and other beneficial soil microbes that suppress Armillaria root disease. Additionally, continued observations of microbial communities in association Armillaria spp. will provide additional insights on microbial changes over time in relation with Armillaria root disease and changing forest environments.