B. bassiana strains
The eight strains of BB were obtained from several culture collections previously (26). The strains and their phenotypes are described in Table 1.
Extraction and detection of oosporein from Beauveria bassiana
Eight morphologically similar BB strains, with variable pigmentation, were selected and grown to obtain detection limits for oosporein production. The starting fungal inoculum was reactivated using Czapex-Dox Yeast Extract Agar medium, and incubated at 25oC for 4–6 weeks, or until conidial lawn is at its maximum. The conidia were harvested from the agar media, titered, and inoculated a 25 mL Czapex-Dox Yeast Extract Broth (CDBYE) medium in a 250 mL Erlenmeyer flask to a final concentration of 1 x 107 conidia/mL and incubated at 28oC for 72–96 h at 175 rpm. After incubation, a 1 mL inoculum was used to inoculate a 250 mL CDBYE medium in a 1000 mL Erlenmeyer flask and incubated at 28oC at 175 rpm. After 5 days of incubation, duplicate mycelial mixtures (2 x 30 mL) were harvested through centrifugation at 10,000 x g for 15 min at 4oC. The supernatant was transferred to a new sterile conical tube and the resulting mycelial pellet was washed twice with ice-cold TE buffer (10 mM Tris/1 mM EDTA, pH 8.0), and the supernatant was combined with the previous sample. The mycelial cultures were incubated further, and the extraction process was carried out again after day 10 and 15. The supernatant was extracted with ethyl acetate (4 x 20 mL), concentrated in vacuo, and purified through high-performance liquid chromatography (HPLC). Oosporein concentration was detected using LC-MS through comparison to a previously synthesized standard (78). Lastly, the limit of detection of oosporein was performed using 10-fold serial dilutions and analyzed via LC-MS.
DNA Sample Preparation for Genomic Analysis
BB strains were grown to establish high quality draft genome reference sequences. The starting fungal inoculum was obtained from frozen mycelial stocks and re-activated using Czapex-Dox Yeast Extract Agar (CDAYE) medium, and incubated at 25oC for 4–6 weeks, or until the conidial lawn is at its maximum. The conidia were harvested from the agar media, titered, and inoculated a 100 mL CDBYE broth medium in a baffled 500 mL Erlenmeyer flask to a final concentration of 1 x 107 conidia/mL and incubated at 28oC for 72–96 h at 175 rpm. Mycelia were harvested through centrifugation at 5000 x g for 20 min at 4oC. The resulting mycelial pellet was washed twice with ice-cold TE buffer, and flash frozen in liquid nitrogen, and stored at -80oC until processing. Quality control analysis for bacterial contamination was performed for all samples. An aliquot of the mycelia was serially diluted ten-fold up to 1.0 x 106 and all dilutions were spread-plated, in quadruplicate, on Standard Methods Agar (SMA, BD Difco Laboratories, Sparks, MD, USA) with 100 U/mL nystatin. Duplicate plates were incubated at either 28oC or 35oC for 28–72 h to assess the bacterial load. Frozen mycelial pellets were sent to the Michael Smith Laboratories, University of British Columbia, for DNA extraction.
DNA Extraction
Fungal DNA was isolated from lyophilized mycelium (~ 5 g) with a modified protocol of Doyle and Dickson (53). The modifications included adding 3% mercaptoethanol to the lysis extraction buffer, incubation at 60oC for 45 minutes and washing the initial pellet with 75% ethanol with 10 mM ammonium acetate. An additional solvent cleaning with 1:1 phenol:chloroform-isoamyl solution was included after incubation with RNAse and proteinase K at 37oC. The concentration and quality of DNA was verified with a Nanodrop 2000c (ThermoFisher Scientific, Waltham, MA, USA), Quantiflor (Promega Corporation, Madison, WI, USA) and 0.8% agarose gel. 3–4 µg total DNA for all 8 strains was sent to Canada’s Michael Smith Genome Sciences Centre for sequencing.
Whole Genome Sequencing
A microfluidic partitioned library was created using the Chromium system (10x Genomics, Pleasanton, CA, USA). Gel beads-in-Emulsion (GEMs) were produced by combining DNA, Master Mix, and partitioning oil in the 10x Genomics Chromium Controller instrument with the microfluidic Genome Chip [PN-120216] (10x Genomics). The DNA in each GEM underwent isothermic amplification as a barcode was added to each fragment. Barcoded fragments then underwent Illumina library construction, as per the Chromium Genome Reagent Kits Version 2 User Guide [PN-120229].
The resulting library was assessed for quality using the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA) and a DNA 1000 assay. The median insert size was 550 bp. The library was quantified using a Quant-iT dsDNA High Sensitivity Assay Kit on a Qubit fluorometer (Invitrogen, Waltham, MA, USA) prior to library pooling and size corrected final molar concentration calculation for Illumina HiSeqX sequencing with paired-end 150 base reads. The Long Ranger BASIC pipeline (v2.1.3) was run on the raw reads to perform read trimming, barcode error correction, whitelisting and barcode assignment (54). The reads were downsampled by barcode to 18 million reads (approximately 80X sequencing depth) for further analyses.
De novo Genome Assembly
The genomes were assembled using Supernova v2.1.1 (55), which leverages the long-range information provided by linked reads. Since read coverages were higher than the recommended range of 38-56X for assembly, 15 million reads were selected (approximately 56X coverage) using --maxreads = 15000000. The output FASTA files were created using the --style = pseudohap option which generated a single record per scaffold, and scaffolds shorter than 1 kb were excluded from the final assembly. The draft genomes were polished by aligning 80X downsampled reads to their respective genomes using BWA-MEM v0.7.17r1188 (56, 57) and supplying these alignments to Pilon v1.23. BUSCO v5.1.2 (58) was run to assess the genic completeness of the assemblies using the hypocreales_odb10 database in genome mode, and other assembly metrics were calculated with QUAST v5.0.2 (59). EDTA v1.9.4 (60) and RepeatModeler v2.0.1 (61) were used to identify and annotate repetitive sequences within the polished assemblies. The coding sequences of BB strain ARSEF 8028 (33) were supplied to EDTA to ensure that gene sequences were excluded from the resulting libraries. All identified repeats were merged with the RepBase (62) database of eukaryotic repeat sequences (v23.12), and redundant sequences were removed using the cleanup_nested.pl script from EDTA (60). This custom repeat library was used as input for RepeatMasker v4.1.1 (63) to annotate and soft-mask repetitive sequences in the polished genome assemblies.
Phylogenetic Inference
Evolutionary relationships between the strains were inferred using RAxML v8.2.12 (64). The BB reference strain ARSEF 2860 (32) and B. pseudobassiana strain KACC 47484 (unpublished; GenBank accession: GCA_003267905.1) were also included in the analysis, as well as C. militaris strain CM01 (65), which was set as the outgroup with the -o option. Single-copy BUSCO sequences were used to generate the phylogeny as annotations were not available for the B. pseudobassiana genome assembly. Multiple sequence alignments were generated for the shared BUSCO sequences with MAFFT v7.475 (66), selecting the appropriate strategy automatically (--auto), and the resulting amino acid alignments were concatenated into a single matrix. The phylogeny was generated from a rapid Bootstrap analysis and search for the best-scoring Maximum Likelihood tree, using the PROTGAMMAAUTO model of amino acid substitution and 100 bootstrap replicates.
Genome Annotation and Orthogroup Inference
Genome annotation was performed using the MAKER (v2.31.10) pipeline (67), which combines ab initio gene predictions and homology evidence. SNAP v2006-07-28 (68), GeneMark.hmm-E v3.47 (69) and AUGUSTUS v3.3.3 (70) were used for ab initio gene prediction; SNAP and GeneMark were both trained on the UAMH 299 assembly, and Fusarium graminearum was used as the gene prediction species model for AUGUSTUS. CDS of BB strain ARSEF 8028 (33) were supplied as expressed sequence tag (EST) evidence and the UniProtKB/Swiss-Prot database of protein sequences (71) was used as protein homology evidence. The gene predictions were processed using Genome Annotation Generator v2.0.1 (72), which added start and stop codons and removed transcripts with introns shorter than 10 bp or coding sequences shorter than 90 bp. Annotations that were missing a start and/or stop codon were then manually removed. The quality of the filtered annotations was assessed with GeneValidator v2.1.10 (73) using the TrEMBL database (71), and these scores were considered while selecting genes of interest in later analyses. Functional annotations were assigned to protein sequences by running stages 4 and 5 of the EnTAP (74) pipeline (v0.10.7-beta), which included a similarity search to the Uniref90 (71) database followed by GO term and Pfam domain assignment using InterProScan v5.30-69.0 (75). Gene predictions that were not annotated by similarity search or gene family assignment were filtered out, and BUSCO (58) was run in protein mode to assess the completeness of the final gene annotations.
Orthogroups were inferred from the protein sequences using OrthoFinder v2.5.1 (34) with default parameters. The longest isoform of each gene was supplied for this analysis. OGs were functionally annotated with Gene Ontology terms and Pfam domains by selecting the most common terms and domains assigned to the genes included in each OG. Core orthogroups were identified as those that included one or more gene in every strain, and core, single-copy OGs were present in only one copy in each. Accessory OGs were identified as those present in two or more, but not all strains, and singleton OGs were those present only in one strain. Next, group-specific OGs were manually extracted by identifying those that included one or more genes from each strain of a given group, while not containing any genes from the strains in the other group. Functional relevance of orthogroup sets was assessed by performing GO enrichment analyses with clusterProfiler v3.18.1 (76). Benjamini & Hochberg’s false discovery rate (FDR) (77) was used to correct the p-values for multiple comparisons, and significantly enriched GO terms were identified at alpha = 0.05.
Biosynthetic Gene Cluster Mining
antiSMASH v5.1.2 (78) was used to identify biosynthetic gene clusters within the genomes using the --taxon fungi option. The MAKER annotations were supplied using the --genefinding-gff3 option. The runs included active site finder analysis (--asf), and clusters were compared against a database of antiSMASH-predicted clusters (--cb-general), known gene clusters from the MIBiG database (--cb-knownclusters) and known subclusters (--cb-subclusters). The fungal-specific Cluster Assignment by Islands of Sites (CASSIS) algorithm (79) was used to aid in the prediction of cluster regions by searching for conserved binding motifs in promoter regions using the --cassis option. The --cf-create-clusters option was used to find extra clusters, and Pfam and GO terms were mapped with --pfam2go.
RNA Sample Preparation for Transcriptomic Analysis
The starting fungal inoculum was reactivated from frozen mycelial stocks through spot inoculation at the centre of a PDA medium and incubated for 3–5 d at 25oC. A one-cm2 agar block was cored on to the mycelial culture and transferred to four different pigment inducing media: CDAYE (for red pigmentation), Malt Extract Agar (MEA; for yellow pigmentation), Potato Dextrose Agar (PDA; for yellow pigmentation), and 0.25xSaboraud Dextrose Agar (SDA; for induction of conidiation). After comparative quality analysis of the cultures (i.e., morphology, pigmentation response, and conidiation density), the conidia from 0.25x SDA were harvested, titered, and inoculated in 25 mL CDBYE broth medium in a 250 mL Erlenmeyer flask to a final concentration of 1.0 x 107 conidia/mL and incubated for 72 h at 28oC at 175 rpm. The active mycelial culture was inoculated (10% v/v inoculum) in a 100 mL CDBYE broth medium in a baffled 500 mL Erlenmeyer flask and incubated for 72–86 h at 28oC at 175 rpm. The mycelial slurries were harvested using a Stericup Quick Release-GP vacuum filtration system (0.22 µm, polyethersulfone membrane; Millipore-Sigma, USA). The mycelial mat was aseptically transferred to a pre-weighed 50 mL conical tube, flash frozen in liquid nitrogen, and stored at -80oC until processing. Quality control analysis was carried out, as described previously, for all the samples to determine bacterial contamination. The frozen mycelial mats were sent to the Michael Smith Laboratories, University of British Columbia, for RNA extraction.
RNA Extraction
Three biological replicates for each of the 8 isolates, for a total of 24 samples, were extracted for total RNA. Starting with 2–5 g (with the exception UAX-29, which required 7–9 g), mycelium was ground with liquid nitrogen in a mortar and pestle to a fine powder. Kolosova et al.’s RNA extraction protocol (80) was used to process 100–200 mg of ground tissue. Instead of drying the RNA pellet for three minutes at room temperature, the sample was spun for an additional 30 seconds, remaining liquid was removed with a micropipette tip and samples were air-dried for one minute. The final pellet was resuspended in 30 µl of Nuclease-Free water. Total RNA concentration was determined using a NanoDrop 1000 (ThermoFisher Scientific) and assessed for quality on an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit LabChips. Total RNA (37.5 ng/µL) was sent to Canada’s Michael Smith Genome Sciences Centre for sequencing.
Transcriptome Sequencing
Qualities of total RNA samples were assessed using an Agilent Bioanalyzer RNA Nanochip and arrayed into a 96-well plate (ThermoFisher Scientific). Polyadenylated RNA was purified using the NEBNext Poly(A) mRNA Magnetic Isolation Module [E7490L] (New England Biolabs, Ipswich, MA, USA) from 1000 ng total RNA. Messenger RNA selection was performed using NEBNext Oligo d(T)25 beads (New England Biolabs) incubated at 65°C for five minutes, followed by snap-chilling at 4°C to denature RNA and facilitate binding of poly(A) mRNA to the beads. mRNA was eluted from the beads in NEBNext Tris Buffer from the NEBNext Poly(A) Magnetic Isolation Kit (New England Biolabs) and incubated at 80°C for two minutes, then held at 25°C for two minutes. RNA binding buffer was added to allow the mRNA to re-bind to the beads, mixed 10 times and incubated at room temperature for five minutes. The sample plate was placed on the magnet and the supernatant discarded. The mRNA bound beads were washed twice, then cleared again on magnet. The supernatant was again discarded, and mRNA was eluted from the beads in 20 µL Tris buffer incubated at 80°C for two minutes. mRNA was transferred to a new 96-well plate.
First-strand cDNA was synthesized from heat-denatured, purified mRNA using a Maxima H Minus First Strand cDNA Synthesis kit (ThermoFisher Scientific) and random hexamer primers at a concentration of 200 ng/µL along with a final concentration of 40 ng/µL Actinomycin D, followed by PCR Clean DX (Aline Biosciences, Woburn, MA, USA) bead purification on a Microlab NIMBUS robot (Hamilton Company, Reno, NV, USA). The second strand cDNA was synthesized following the NEBNext Ultra Directional Second Strand cDNA Synthesis protocol (New England Biolabs) that incorporates dUTP in the dNTP mix, allowing the second strand to be digested using USER™ enzyme (New England Biolabs) in the post-adapter ligation reaction and thus achieving strand specificity.
cDNA was fragmented by Covaris LE220 sonication to achieve 250–300 bp average fragment lengths. The paired-end sequencing library was prepared following the Canada’s Michael Smith Genome Sciences Centre strand-specific, plate-based library construction protocol on a Microlab NIMBUS robot (Hamilton Company). Briefly, the sheared cDNA was subject to end-repair and phosphorylation in a single reaction using an enzyme mix (New England Biolabs) containing T4 DNA polymerase, Klenow DNA Polymerase and T4 polynucleotide kinase, incubated at 20°C for 30 minutes. Repaired cDNA was purified in 96-well format using PCR Clean dX beads (Aline Biosciences) and 3’ A-tailed (adenylation) using Klenow fragment (3’ to 5’ exo minus) and incubated at 37°C for 30 minutes prior to enzyme heat inactivation. Illumina TruSeq adapters were ligated at 20°C for 15 minutes. The adapter-ligated products were purified using PCR Clean DX beads, then digested with USER™ enzyme (1U/µL) (New England Biolabs) at 37°C for 15 minutes followed immediately by 10 cycles of indexed PCR using NEBNext Ultra II Q5 Master Mix (New England Biolabs) and Illumina’s primer set. The PCR products were purified and size-selected twice using a 1:1 PCR Clean DX beads-to-sample ratio, and the eluted DNA quality was assessed with Caliper LabChip GX for DNA samples using the High Sensitivty Assay (PerkinElmer, Waltham, MA, USA) and quantified using a Quant-iT dsDNA High Sensitivity Assay Kit on a Qubit fluorimeter (Invitrogen) prior to library pooling and size-corrected final molar concentration calculation for Illumina HiSeq sequencing with paired-end 150 base reads.
The samples were submitted to the NCBI SRA under BioProject accession PRJNA877233. The corresponding BioSample accessions are presented in Table 5.
Table 5
BioSample accession information for Beauveria bassiana samples under investigation.
Strain | Sample Title | Accession | Sample Name |
UAMH 299 | Beauveria bassiana strain UAMH 299 | SAMN30686972 | Bb_UAMH_299 |
UAMH 299-UVR | Beauveria bassiana strain UAMH 299 (UV-resistant derivative) | SAMN30686973 | Bb_UAMH_299-UVR |
UAMH 298 | Beauveria bassiana strain UAMH 298 | SAMN30686974 | Bb_UAMH_298 |
UAMH 298-UVR | Beauveria bassiana strain UAMH 298 (UV-resistant derivative) | SAMN30686975 | Bb_UAMH_298-YVR |
UAMH 1076 | Beauveria bassiana strain UAMH 1076 | SAMN30686976 | Bb_UAMH_1076 |
UAX-29 | Beauveria bassiana strain UAX-29 | SAMN30686977 | Bb_UAX-29 |
UAMH 4510 | Beauveria bassiana strain UAMH 4510 | SAMN30686978 | Bb_UAMH_4510 |
110.25 | Beauveria bassiana strain 110.25 | SAMN30686979 | Bb_110.25 |
Differential Expression and Co-Expression Analysis
Gene-level expression was quantified against CDS from the reference BB strain ARSEF 2860 (32) with Salmon v1.5.2 (81). Salmon was run in quasi-mapping mode and additionally corrected for sequence-specific and fragment-level GC biases using the --seqBias and --gcBias options. Differential expression analysis was carried out to using DESeq2 (82). Gene expression was compared between the red and non-red strains, with the non-red strains used as the reference level. Differential expression (DE) was tested using a log2 fold change (LFC) threshold of 1 and altHypothesis=”greaterAbs” to identify upregulation and downregulation, and significantly DE genes were selected at FDR < 0.05. GO enrichment analysis was performed using clusterProfiler (76). The ARSEF 2860 annotations were obtained for this analysis from AnnotationHubv2.22.1 (83) under the record AH86840, and significantly enriched terms were selected at FDR < 0.05. Genes with mean normalized count values of 0 were excluded from the background gene set for the GO enrichment analysis.
Differential co-expression was assessed using the DGCA R package (v1.0.2) (84). Variance stabilizing transformation from DESeq2 was applied to the raw count data with blind = F, and genes in the lowest 25th percentile of variant were filtered out. Co-expression was calculated between gene pairs consisting of all filtered genes and OpS1 (BBA_08179) and OpS3 (BBA_08181), and differential correlation was identified between red and non-red strains. P-values were adjusted using FDR and significantly co-expressed gene pairs were identified between red and non-red strains at FDR < 0.05.