Whole Genome Sequencing and Expression Analysis of ToxA in Bipolaris sorokiniana Provides Discernment of Pathogenicity Causing Spot Blotch of Wheat

Background: Spot blotch disease of wheat, caused by Bipolaris sorokiniana Boerma (Sacc.), is an emerging problem in South Asian countries. In this study, the whole genome of a highly virulent isolate of Bipolaris sorokiniana (BS112) was sequenced, pathogenicity-related genes were identified and the role of the ToxA gene in spot blotch disease development was established.
Results: Bipolaris sorokiniana isolate BS112 infecting wheat was sequenced using a hybrid assembly approach. The assembly size of the genome was 35.64 Mb (GenBank accession number RCTM00000000) with a GC content of 50.2%, providing 97.6% coverage of the reference ND90Pr genome. The average predicted gene density was 250-300 genes/Mb. A total of 235 scaffolds were obtained using the pyScaf assembler, with an N50 of 1,654,800 bp. In addition, 152 transcription factors involved in various biological processes were identified and a total of 682 secretory proteins were predicted using secretome analysis. The ToxA gene (535 bp) was identified and analyzed in the genome of B. sorokiniana and showed 100% sequence identity with the ToxA gene of Pyrenophora tritici-repentis. Further, the ToxA gene was amplified, sequenced and validated in 39 isolates of B. sorokiniana, which confirmed its presence in all of these isolates. All of these ToxA sequences were submitted to the NCBI database (MN601358-MN601396). As the ToxA gene interacts with the Tsn1 gene of the host, 13 wheat genotypes were evaluated for the Tsn1 gene; five genotypes (38.4%) were found to be Tsn1 positive and showed more severe necrotic lesions than the Tsn1 negative wheat genotypes. In vitro expression analysis of the ToxA gene in B. sorokiniana isolate BS112 using qPCR revealed maximum upregulation (14.67 fold) at the 1st day after inoculation (DAI). Further, in planta expression analysis of the ToxA gene was conducted in a Tsn1 positive and a Tsn1 negative genotype, Agra local and Chiriya 7, respectively. The results revealed maximum expression (7.89 fold) of the ToxA gene in the Tsn1 positive genotype Agra local at the 5th DAI, whereas the Tsn1 negative genotype Chiriya 7 showed minimum expression (0.048 fold) at the 5th DAI.
Conclusions: The full genome of B. sorokiniana was sequenced; secreted proteins and virulence genes were identified in the genome. The ToxA gene was validated in thirty-nine isolates of B. sorokiniana. In planta ToxA-Tsn1 interaction studies established that spot blotch disease is more severe in Tsn1 positive genotypes. This genomic resource will provide new insight for better understanding and management of B. sorokiniana and spot blotch disease of wheat.

Background
Wheat is a staple crop and a leading source of protein in the human diet. Wheat production throughout North and South Asia is affected by several abiotic and biotic stresses. Among the biotic stresses, the most important wheat disease throughout North and South Asia is spot blotch, caused by Bipolaris sorokiniana. This fungus also causes various other diseases, such as foliar blight, common root rot, black point and seedling blight of barley and wheat [1,2]. However, spot blotch of wheat is the most important of these diseases and is prevalent in areas of high temperature and humidity [3]. Almost 9 million ha of land is affected by spot blotch in India [4], causing a yield loss of approximately 14.9% [5,6]. B. sorokiniana is an aggressive fungal pathogen that causes small, dark brown lesions on leaves, 1-2 mm in length and without a chlorotic margin. These lesions become dark brown in 4 to 5 days and extend to several centimeters in susceptible genotypes, inducing leaf abscission. Environmental factors play an important role in disease severity. In India, the late-sown wheat crop is highly vulnerable to this disease because of temperatures that range between 26 and 30°C [7].
In the past, spot blotch was believed to be caused by a complex of three species, i.e. Alternaria triticina, Helminthosporium sativum and Pyrenophora tritici-repentis [8]. Later studies revealed that it is mainly caused by the aggressive hemibiotrophic pathogen B. sorokiniana. The disease appears at the seedling stage and becomes more severe as the plant grows. When the pathogen infects the plant, conidia germinate on the leaf surface and produce a germ tube that forms a rudimentary appressorium. The infection hyphae either penetrate the cuticle and epidermal cell wall directly or enter through stomata [9,10]. B. sorokiniana produces many non-selective toxins and hydrolytic enzymes that form lesions on the leaf, leading to leaf abscission [11,12].
Metabolites produced by B. sorokiniana are considered to be toxins that play an important role in disease development. During 1990, ToxA was discovered in Parastagonospora nodorum, which causes septoria nodorum blotch (SNB), and it was horizontally transferred to Pyrenophora tritici-repentis, which causes tan spot [13,14]. ToxA was reported to be unique to Pyrenophora tritici-repentis but not conserved in its genome [15]. Recently, the ToxA gene was discovered in Australian B. sorokiniana isolates [16], and it also exists in the B. sorokiniana population in the winter wheat region of the United States [17]. According to Gupta et al. [18], effector-triggered susceptibility was identified in the wheat-B. sorokiniana pathosystem, where ToxA (virulence gene) utilizes Tsn1 (sensitivity gene) in the host, which helps the fungus to cause disease by invading the host. Necrosis was induced on the leaves of ToxA-sensitive wheat genotypes (possessing the Tsn1 susceptibility gene) [19]. According to McDonald et al. [16], the presence of the sensitivity gene Tsn1 in wheat generally helps a ToxA positive pathogen to cause spot blotch disease, and ToxA functions in a Tsn1-dependent manner. In general, the ToxA-Tsn1 system is a prototype of an inverse gene-for-gene relationship [20].
The present study was undertaken to analyze the genome and its function and to provide new insights into the genome of the Indian B. sorokiniana isolate BS112 (collected from BHU, Uttar Pradesh; accession number: MN601379). Here, three genome sequences of Bipolaris species were compared with the genome sequence of B. sorokiniana strain BS112. Further, genes involved in carbohydrate-active enzymes, pathogenesis and secondary metabolism have been identified and validated. The genome was also analyzed for ToxA and its copy number.

Gene prediction and gene ontology annotation
A total of 10,460 protein-coding genes were identified, with an average gene size of 435-545 bp. A total of 10,141 proteins were annotated against the fungal database and 391 proteins were unannotated (Additional file 1). The average gene density was 250-300 genes/Mb in the genome of B. sorokiniana, and the maximum and minimum gene sizes were 8,506 bp and 35 bp, respectively. To classify the putative pathogenicity-related genes, 3,627 genes were annotated against the pathogen-host interaction database (PHI database). Based on homology with pathogenicity proteins, 1,475 genes were related to reduced virulence, 1,281 genes to unaffected pathogenicity, 264 genes to loss of pathogenicity, 174 genes to lethality and 117 genes to unaffected pathogenicity/reduced virulence (Fig. 1).
In the gene ontology (GO) analysis, 1,024 of a total of 2,788 protein-coding genes were assigned to 'Biological Process', which included 217 genes for transmembrane transport, 187 genes for carbohydrate metabolic process, 182 genes for metabolic process (Fig. 2a)

ToxA-Tsn1 gene relationship in wheat genotypes
In the host, the presence of the sensitivity gene Tsn1 (xfcp623) helps a ToxA positive pathogen to cause more severe necrotic lesions on wheat genotypes. To predict the allelic status at the Tsn1 locus (400 bp amplicon), thirteen wheat genotypes were screened by PCR amplification. Five genotypes (WH 542, WL 711, HD 29, PBW 343 and Agra local; 38.4%) were found to be Tsn1 positive, whereas the remaining eight lacked Tsn1 and instead carried tsn1 (Fig. 10a). Further, a pathogenicity assay with B. sorokiniana (BS112) was carried out on these thirteen wheat genotypes to identify the correlation between Tsn1 (presence or absence) and the formation of necrotic lesions. The average disease index was calculated on the basis of the number and length of lesions on the leaves of the different wheat genotypes (Fig. 10b). The genotypes WH 542, WL 711, HD 29, PBW 343 and Agra local showed severe necrotic lesions on leaves when inoculated with B. sorokiniana, while the Tsn1 negative wheat genotypes showed no necrotic lesion formation. This confirmed that ToxA produces a susceptible reaction phenotype on wheat genotypes harboring the Tsn1 gene [Additional file 8: Fig. S1].

Relative expression analysis of the ToxA gene under in vitro and in planta conditions
To evaluate host gene expression during fungal infection, relative qPCR with the EFN-1 alpha gene as an internal control was performed to determine the time-dependent effect of ToxA on wheat genotypes. In vitro expression analysis of the ToxA gene in B. sorokiniana isolate BS112, collected at different time intervals, revealed maximum upregulation (14.67 fold) at the 1st day after inoculation (DAI), followed by the 2nd DAI (11.83 fold); expression then gradually decreased at 3, 4 and 5 DAI in minimal basal medium (Fig. 11a). In planta expression analysis was performed using qPCR on inoculated leaves collected at different time intervals from two genotypes. The maximum expression (7.89 fold) was observed at the 5th DAI in the susceptible cultivar (Agra local), while the minimum expression (0.048 fold) was found in the resistant cultivar (Chiriya 7) at the 5th DAI (Fig. 11b).
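The fold values above are relative expression levels. The text does not state how they were derived from the raw qPCR CT values; a common choice is the 2^-ΔΔCT method, and the following is a minimal sketch under that assumption. The function name and the CT values in the usage lines are hypothetical and are not data from this study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCT method (assumed here, not stated in the text).

    ct_target_* : CT of the gene of interest (e.g. ToxA)
    ct_ref_*    : CT of the internal control gene
    *_sample    : the time point or genotype being tested
    *_control   : the calibrator (e.g. the 0 DAI sample)
    """
    # Normalise the target gene to the internal control in each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # A lower CT means more template, so a negative ddCT gives a fold change above 1.
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical CT values, for illustration only.
print(fold_change(22.0, 18.0, 25.0, 18.0))  # target is 3 cycles earlier after normalisation
print(fold_change(25.0, 18.0, 25.0, 18.0))  # identical to the calibrator
```

Under this scheme a value above 1 indicates upregulation relative to the calibrator and a value below 1, such as the 0.048 fold reported for Chiriya 7, indicates downregulation.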

Discussion
Wheat is the most vital food grain in the northern and north-western parts of India. India is the second largest producer of wheat after China and accounts for 8.7 per cent of the world's total wheat production [23]. It is rich in proteins, carbohydrates and vitamins and provides balanced food. High yield loss in wheat is mainly due to fungal diseases such as wheat rusts, spot blotch and head blight. Over the next several decades, given the predicted high growth rate of the global population, it will be a challenge to meet the rapidly rising demand for wheat. Management of fungal diseases and development of resistant wheat genotypes are key components in meeting this challenge [24]. For fungal disease management and for understanding the dynamic nature of the genome of a fungal pathogen, whole genome sequencing is of great significance. Genome sequencing of fungal pathogens shows extensive variation in genome structure and composition between species and also between isolates of the same species. Genome sequencing is a prerequisite for many strategies and alternatives for plant defense reactions against fungal pathogens [25].
Using next generation sequencing technology, we have produced a draft genome of B. sorokiniana with a predicted total assembly size of 35.64 Mb [21], which is very similar to B. Further, the study revealed 138 proteins containing the ankyrin repeat Pfam domain, which is involved in a diverse set of cellular functions. Four ankyrin repeats are present in ToxE, which is responsible for the expression of the Tox2 gene in Bipolaris zeicola (highly virulent on maize genotypes) [27].
To penetrate the rigid barrier of the plant cell wall, fungi produce many types of CAZymes (carbohydrate-active enzymes). In this study, we report the functional identification of the ToxA gene in different Indian populations of B. sorokiniana. The frequency of the ToxA gene in a global population dataset was found to differ drastically depending on the area from which the population was sampled. Earlier, for Australian populations of P. nodorum and P. tritici-repentis, ToxA was present in all the tested isolates [29,30]. Recently, Navathe et al. [20] reported that 70% of isolates in a tested Indian population of B. sorokiniana were ToxA positive, whereas our results revealed that the ToxA gene was present in all (100%) of the Indian isolates of B. sorokiniana included in the study. In the present study, the ToxA gene was PCR amplified in all 39 isolates of B. sorokiniana and the length of the amplicon was uniform (535 bp) across isolates. The sequencing data suggested a high degree of similarity among the ToxA genes present in B. sorokiniana, Pyrenophora tritici-repentis and Phaeosphaeria species. The similarity between these three species is best explained by horizontal gene transfer [13]. The horizontal gene transfer may be due to hybridization between the various fungal species [16].
The exact sequence match between B. sorokiniana (BS112) and the other two species suggests a recent global migration of these strains.
The expression of the ToxA gene, investigated using qPCR, measures the level of B. sorokiniana in leaf tissue displaying disease symptoms at different time intervals. Gene expression levels were compared by CT (threshold cycle) values for each runner gene. The stability of gene expression was reflected through the coefficient of variation of the CT values, wherein the higher the CT value, the lower the gene expression level, and vice versa. The pathogenicity assay and PCR amplification confirmed that the isolate carrying ToxA caused severe necrotic lesions on wheat genotypes having the sensitivity gene Tsn1. In contrast, it caused fewer or no lesions on wheat genotypes lacking Tsn1. Quantitative gene expression analysis and the pathogenicity assay revealed that the necrotic symptoms observed on leaves of the Agra local genotype were due to high expression of the ToxA gene, which gradually increased and caused leaf abscission, whereas on Chiriya 7 the effect was the reverse. Tsn1 is part of a complex network of resistance/susceptibility to spot blotch. According to Navathe et al., ToxA and Tsn1 have a gene-for-gene relationship; similarly, our study confirms that ToxA positive isolates cause more severe necrosis on wheat genotypes harboring the Tsn1 gene [20]. Recently, Lei et al. also confirmed the ToxA-Tsn1 relationship through the culture filtrate infiltration method. However, necrosis on the wheat leaf is not induced only by ToxA; it can also be caused by other toxins present in the culture filtrate or crude extract [31]. The study of the ToxA-Tsn1 interaction is very useful for breeding programs in areas that are highly prone to spot blotch. Wheat breeders may avoid genotypes with the Tsn1 allele and retain genotypes with the tsn1 allele.

Conclusions
In this study, a detailed analysis of the genome of B. sorokiniana (BS112) was conducted, which led to the identification of unique SSRs, protein domains and secretory proteins, the establishment of phylogenetic relationships among different Bipolaris species and an understanding of the genes involved in host-pathogen interactions at the molecular level. This study further established the ToxA-Tsn1 interaction in Indian wheat genotypes and confirmed that spot blotch is more severe in Tsn1 positive wheat genotypes.

Fungal collection, isolation and DNA extraction
Infected wheat leaves were collected from different regions of India. For isolation of the pathogen, infected leaves were surface sterilized and necrotic lesions were cut into smaller fragments. These fragments were washed with 0.1% sodium hypochlorite (NaOCl) solution, followed by two washes with water. To induce sporulation, the fragments were placed on PDA (potato dextrose agar) under a 12-h photoperiod at room temperature (RT). Several single spores from each of the PDA plates were transferred to fresh PDA plates and allowed to grow at 25°C. The spores were observed under a microscope and the fungus was identified on the basis of its conidia and morphology [32].
Thirty-nine isolates of B. sorokiniana were established from infected wheat leaf samples collected from different regions of India. These cultures were grown in potato dextrose broth (PDB) and incubated at 25°C with periodic shaking. After 25 days, the mycelial mat (0.2 g) from each isolate was harvested aseptically and snap frozen in liquid nitrogen. Frozen mycelia were subjected to DNA extraction using the CTAB method [33]. The quantity and quality of the DNA were then assessed using a Qubit® 2.0 Fluorometer (Thermo Fisher Scientific, Wilmington, USA) and agarose gel electrophoresis, respectively. Genomic DNA of BS112 (ITS NCBI accession KU201275) was sent to Genotypic Technology (P) Ltd. for genome sequencing using hybrid genome assembly with the Illumina, Nanopore and Ion Torrent platforms.

Genome sequencing and hybrid assembly
The genome of B. sorokiniana was sequenced using the Illumina HiSeq, Oxford Nanopore and Ion Torrent platforms. Producing a hybrid assembly of the fungal genome was the main aim of the project. The Illumina reads were pre-processed using Trim Galore [34], and the Nanopore fast5 data were base called using Albacore [35]. The hybrid genome assembly was created using these three types of sequenced reads. K-mer analysis is not required for the hybrid assembly because the internal long-read algorithm is based on overlap-layout-consensus (OLC). The MaSuRCA tool [36] was used for the hybrid assembly, and the resulting genome assembly was 35.64 Mb. The contigs were subsequently processed using pyScaf [37]. The paired-end library data were generated using the Illumina HiSeq platform.
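The scaffold N50 reported in the abstract is a length-weighted median: the scaffold length at which the cumulative length of the scaffolds, taken from longest to shortest, first reaches half of the total assembly size. The following is a minimal sketch of that calculation; the function name and the scaffold lengths are hypothetical and are not the BS112 scaffolds.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical scaffold lengths in bp, for illustration only.
print(n50([2, 3, 4, 5, 6]))
```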

Gene prediction, annotation and pathway analysis
Functional annotation of the genes was performed using the BLASTx tool [39]. For gene ontology and annotation, the predicted proteins were searched for similarity against a fungal protein database using the blastp program of ncbi-blast-2.2.29+ with an e-value of 1e-5. Around 98% of the predicted genes were annotated against the protein database. The proteins were also annotated against all Viridiplantae kingdom protein sequences (from the UniProt protein database). For further analysis, proteins with more than 30% identity were retained. The genes of the fungal hybrid assembly were annotated into biological, chemical and molecular functions. Pathway analysis of the predicted genes was performed using the KEGG database [40].
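The two cut-offs described above (an e-value of 1e-5 and more than 30% identity) can be applied to BLAST results in a few lines. The following is a minimal sketch, assuming the hits were written in BLAST+ tabular format (`-outfmt 6`), whose default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The output format, the function name and the example identifiers are assumptions made for illustration; the text does not say how the filtering was implemented.

```python
import csv
import io


def filter_hits(tabular_text, max_evalue=1e-5, min_identity=30.0):
    """Keep hits with evalue <= max_evalue and percent identity > min_identity."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        pident = float(row[2])   # column 3: percent identity
        evalue = float(row[10])  # column 11: e-value
        if evalue <= max_evalue and pident > min_identity:
            kept.append((row[0], row[1], pident, evalue))
    return kept


# Hypothetical hits, for illustration only.
example = (
    "gene1\tprotA\t45.2\t200\t100\t2\t1\t200\t1\t200\t1e-30\t250\n"
    "gene2\tprotB\t28.0\t150\t100\t3\t1\t150\t1\t150\t1e-10\t80\n"
    "gene3\tprotC\t60.0\t50\t20\t0\t1\t50\t1\t50\t0.01\t30\n"
)
for hit in filter_hits(example):
    print(hit)
```

In this example only gene1 passes: gene2 fails the identity cut-off and gene3 fails the e-value cut-off.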

Phylogenetic analysis of B. sorokiniana genomes
Phylogenetic analysis was carried out on the basis of whole-genome alignment, using progressiveMauve version 20150213 build 0 with default parameters [41]. sorokiniana/Cochliobolus sativus query sample (BS112) and one isolate of Pyrenophora tritici-repentis (outgroup) were taken for the multiple genome alignment. Comparative analysis of all six genomes was performed using a neighbor-joining tree. The guide tree was an output of the multiple sequence alignment in Mauve. The phylogenetic trees were constructed using the guide tree in phylogeny.

Comparative genome annotation of orthologous gene families
The orthologous gene families among the Bipolaris species GCA_000527765.1, GCA_000523455.1 and GCA_000523435.1 were identified using OrthoVenn. Comparison among these species revealed 8,674 gene families in common.

SNP (Single Nucleotide Polymorphism) and SSR (Simple Sequence Repeat) analysis
SNP prediction was carried out using the snpEff tool with the Illumina HiSeq data [42]. Alignment was done using minimap2 [43] and variant detection was done using SAMtools [44]. Simple sequence repeats were identified using the MicroSatellite identification tool (MISA) with default parameters [45].
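To illustrate what SSR detection involves, the following is a minimal regular-expression sketch that reports perfect repeats of 1 to 6 bp motifs. It is not MISA and does not reproduce its output (for example, compound SSRs are not handled). The minimum repeat counts used here (10, 6, 5, 5, 5, 5 for mono- to hexanucleotide motifs) are the values commonly found in MISA's configuration file and should be treated as an assumption; the function name and test sequence are hypothetical.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR, sorted by start position."""
    sequence = sequence.upper()
    found = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated, e.g. "AA" or "ATAT".
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(found)


# Hypothetical sequence: an (AT)7 repeat and an (A)12 repeat.
print(find_ssrs("GG" + "AT" * 7 + "CC" + "A" * 12 + "GT"))
```

Positions are 0-based offsets into the sequence; a genome-wide run would apply the function to each scaffold in turn.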

Protein family classification

The pathogenicity assay was carried out on 3-week-old plants of the thirteen wheat genotypes mentioned above. The inoculum was prepared from BS112 cultured on sorghum grain under aseptic conditions. After 25 days, the culture was crushed in water and filtered through muslin cloth to give a spore density of 1 × 10⁶ spores/mL, and Tween 20 was added to the spore suspension for uniform spraying. Plants were kept in a glass chamber in a polyhouse, where the temperature was maintained at 25°C and the relative humidity at >80%. The spore suspension of BS112 was sprayed on the flag leaf of four plants of each genotype. Photographs were taken 4 days post-inoculation and disease severity (%) was calculated as per the following scale: 0 = free of spots; 1 = up to 5% of the leaf area covered with necrotic spots; 2 = 6-20% of the leaf area covered; 3 = 21-40% of the leaf area covered; 4 = 41-60% of the leaf area covered; 5 = more than 60% of the leaf area covered.
The Average Disease Index (ADI) was calculated using the formula: ADI = (sum of the ratings of all leaves / (total number of leaves × 5)) × 100

Figure 1 Whole genome functional annotation of Bipolaris sorokiniana predicted genes using pathogen-host interaction (PHI) database.
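The 0-5 rating scale and the Average Disease Index from the pathogenicity assay can be sketched as follows. This assumes the conventional percentage form of the index, in which the sum of the ratings is divided by the number of leaves times the maximum rating (5) and the ratio is multiplied by 100. The function names and the example values are hypothetical and are not measurements from this study.

```python
def rating_from_area(percent_area):
    """Map percent leaf area covered by necrotic spots to the 0-5 scale."""
    if percent_area <= 0:
        return 0  # free of spots
    if percent_area <= 5:
        return 1
    if percent_area <= 20:
        return 2
    if percent_area <= 40:
        return 3
    if percent_area <= 60:
        return 4
    return 5      # more than 60% of the leaf area covered


def average_disease_index(ratings, max_rating=5):
    """ADI in percent: 100 * sum(ratings) / (number of leaves * max_rating)."""
    if not ratings:
        raise ValueError("at least one leaf rating is required")
    return 100.0 * sum(ratings) / (len(ratings) * max_rating)


# Hypothetical leaf areas (%) for the four plants of one genotype.
areas = [70, 55, 35, 45]
ratings = [rating_from_area(a) for a in areas]
print(ratings, average_disease_index(ratings))
```

An ADI of 0 means that every leaf was free of spots, and 100 means that every leaf received the maximum rating.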