The characterization and candidate gene isolation for a novel male-sterile mutant ms40 in maize

A novel genic male-sterile mutant ms40 was obtained from EMS-treated RP125. The key candidate gene ZmbHLH51, located on chromosome 4, was identified by map-based cloning. This study further enriches the male-sterile gene resources for both production applications and theoretical studies of abortion mechanisms. Maize male-sterile mutant 40 (ms40) was obtained from the progeny of the ethyl methanesulfonate (EMS)-treated inbred line RP125. Genetic analysis indicated that the sterility was controlled by a single recessive nuclear gene. Cytological observation revealed that the cuticles of ms40 anthers were abnormal, and scanning electron microscopy (SEM) showed no Ubisch bodies on the inner surface of ms40 anthers. Moreover, degradation of the ms40 tapetum was delayed, which blocked the formation of normal microspores. Using a map-based cloning strategy, the ms40 locus was mapped to a 282-kb interval on chromosome 4, and five annotated genes were predicted within this region. PCR-based sequencing detected a single non-synonymous SNP (G > A) that changed glycine (G) to arginine (R) in the seventh exon of Zm00001d053895, while no sequence difference between ms40 and RP125 was found for the other four genes. Zm00001d053895 encodes the bHLH transcription factor ZmbHLH51, which is localized in the nucleus. Phylogenetic analysis showed that ZmbHLH51 had the highest homology with Sb04g001650, a tapetum degeneration retardation (TDR) bHLH transcription factor in Sorghum bicolor. Co-expression analysis revealed a total of 1192 genes co-expressed with ZmbHLH51 in maize, 647 of which were anther-specific genes. qRT-PCR results suggested that the expression levels of some known genes related to anther development were affected in ms40. In summary, these findings reveal the abortion characteristics of ms40 anthers and lay a foundation for further studies on the mechanisms of male fertility.


Xiaowei Liu and Yujing Yue contributed equally to this work.

Introduction
Maize is one of the most important and widely cultivated crops in the world and one of the earliest crops in which heterosis was utilized. The demand for maize hybrid seed increases every year, and artificial emasculation remains the most common production method, although it is time-consuming and laborious. Moreover, the purity of hybrid seed is difficult to guarantee. Using male-sterile lines in maize hybrid seed production can largely address these problems. Maize male sterility is divided into cytoplasmic male sterility (CMS) and genic male sterility (GMS). The application of CMS has some obvious problems, such as the instability of sterility and the difficulty of finding strong and stable restorer lines. For GMS, it is difficult to find complete maintainer lines, making it hard to apply GMS directly to hybrid seed production in maize. The emergence of seed production technology (SPT) brings hope for applying GMS in hybrid seed production (Fox et al. 2017). Breeders have therefore paid more attention to GMS genes, and an increasing number of such genes have been cloned. However, few male-sterile mutants have been created by ourselves. It is therefore necessary to create male-sterile mutants with independent intellectual property rights based on our excellent maize inbred lines.
To date, approximately 19 GMS genes have been successfully cloned in maize. They encode different protein types, including secretory proteins, lipid transporters, redox proteins, enzymes, and transcription factors. MSCA1 (MULTIPLE ARCHESPORIAL CELLS1) encodes a plant-specific glutathione reductase, and the mutant msca1 has a deleted GSH binding site, which may impact the initiation of archesporial cells (Albertsen et al. 2009). Its homologous genes, OsTDL1A and AtTPD1, have been reported to be related to anther development (Wang et al. 2012). Ms6021, Ms33, Ms30 and IPE1 (IRREGULAR POLLEN EXINE1) are all functional proteins that participate in lipid or fatty acid metabolism (Chen et al. 2017; Tian et al. 2017; Xie et al. 2018). Both Ms26 and APV1 (ABNORMAL POLLEN VACUOLATION1) encode cytochrome P450 monooxygenases (Djukanovic et al. 2013; Somaratne et al. 2017), and MS45 encodes a hydroxyproline-rich glycoprotein family protein (Cigan et al. 2001); all of these are required for the formation of pollen exine and anther cuticles in maize. Some transcription factors have been reported to be associated with genic male sterility in maize. IG1 (INDETERMINATE GAMETOPHYTE1) encodes a LOB domain protein that can regulate the proliferative phase of female gametophyte development (Evans 2007). OCL4 (OUTER CELL LAYER) encodes an HD-ZIP transcription factor that plays a major role in trichome differentiation and division of the anther cell wall in maize (Elena et al. 2013). MS9 encodes an R2R3 plant-specific MYB transcription factor (Albertsen et al. 2016). MS7 encodes a PHD-finger transcription factor, which was used for hybrid seed production by a multicontrol sterility system. Both Ms23 and Ms32 encode bHLH transcription factors responsible for tapetal development (Moon et al. 2013; Nan et al. 2017). The bHLH transcription factors (TFs) are a large family in flowering plants, and 213 members have been annotated in maize (Lin et al. 2014).
Among them, Ms23 encodes the bHLH16 transcription factor (Nan et al. 2017), which plays a major role in the differentiation of the endothecium and tapetum cells of anthers. MS32 encodes the bHLH66 transcription factor (Moon et al. 2013) and is specifically expressed in anthers at the premeiotic stage. Moreover, MS32 can interact with the protein encoded by MAC1 to regulate the pericytosis of L2 layer cells and the differentiation of anther sporogony, thus affecting anther development. Although these bHLH transcription factors have been cloned, the regulatory mechanisms of pollen abortion have not been clearly elucidated. The discovery of other bHLH transcription factors controlling male sterility in maize may help determine the regulatory relations among these bHLH transcription factors.
Generally, transferring a specific male-sterile gene from one genetic background into an elite inbred line by traditional breeding is a long process. The best strategy is therefore to create a male-sterile mutant with a single base change directly in an elite inbred line background, which can effectively accelerate the application of the GMS gene. The maize inbred line RP125, cultivated by Sichuan Agriculture University, is widely planted in the southwest of China. It has a high combining ability, high yield, high resistance to northern leaf blight and southern leaf blight, moderate resistance to sheath blight and other major diseases of the southwest corn production area, and efficient utilization of phosphorus, making it one of the most popular parents in southwestern China in the twenty-first century.
In this study, we found a no-pollen male-sterile mutant, ms40, derived from EMS-treated pollen of the maize inbred line RP125. Cytological observation showed that the tapetum of ms40 anthers was abnormal, and defects in Ubisch bodies and pollen exine were observed. The sterile gene of ms40 was mapped to a 282-kb interval on chromosome 4 by map-based cloning, and Zm00001d053895 was found to be the key candidate gene. This study provides a new genetic resource not only for the application of GMS in hybrid seed production but also for the interpretation of the regulatory mechanism of maize anther development.

Plant materials
In the spring of 2015, the maize inbred line RP125, cultivated by Sichuan Agriculture University, was planted in the experimental field of Sichuan Agriculture University in Sichuan. The pollen was treated with ethyl methanesulfonate (EMS) and then used for self-pollination to produce M1 seeds. The M1 seeds were planted in the Yunnan experimental field of Sichuan Agriculture University in the autumn of 2015. The M1 plants were self-pollinated to obtain M2 seeds, which were planted in the Sichuan experimental field in the spring of 2016. Among the M2 plants, a male-sterile mutant was found and termed ms40. It was pollinated with RP125 pollen to obtain (ms40 × RP125)F1 seeds. Inbred lines B73 and Mo17 were also used in this study.

Phenotype identification and genetic analysis
The (ms40 × RP125)F1 seeds were planted, and the tassels of the resulting plants were completely fertile. The (ms40 × RP125)F1 plants were self-pollinated to obtain (ms40 × RP125)F2 seeds, which were planted in both the Sichuan and Yunnan experimental fields of Sichuan Agriculture University. The fertility of individuals in the F2 population was identified at the adult stage. Meanwhile, pollen grains were stained with 1% (m/v) I2-KI solution to evaluate the fertility of anthers. After the sterility phenotype of ms40 was shown to be stably inherited, the inbred lines B73 and Mo17 were used as male parents to construct (ms40 × B73)F2 and (ms40 × Mo17)F2 populations for genetic analysis, and the Chi-square test was used for phenotype segregation analysis.
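The segregation test described above can be sketched as a Chi-square goodness-of-fit test against a 3:1 ratio. This is only an illustrative sketch: the function name and the plant counts are hypothetical and are not taken from Table 1, and the original analysis may have been run with different software.

```python
import math


def chi_square_3_to_1(fertile, sterile):
    """Goodness-of-fit test of observed counts against a 3:1 ratio (df = 1)."""
    total = fertile + sterile
    expected = (0.75 * total, 0.25 * total)
    observed = (fertile, sterile)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the chi-square survival function
    # reduces to the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical F2 counts, for illustration only.
chi2, p = chi_square_3_to_1(fertile=302, sterile=98)
print(f"chi2 = {chi2:.4f}, p = {p:.4f}")
```

A P-value above 0.05 means the observed counts do not deviate significantly from 3:1, which is the pattern expected for a single recessive nuclear gene.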
A digital camera M3 (Canon, Japan) and stereomicroscope SZX16 (Olympus, Japan) were used to take photographs of the plants and anthers. The stainability of pollen grains was observed and photographed by a light microscope DM2000 (Leica, Germany).

Cytological observations of anther development
For scanning electron microscopy (SEM) observation, the mature anthers of fertile and male-sterile plants from the (ms40 × RP125)BC1F1 population were dissected and fixed with glutaraldehyde. Then, the samples were dried for approximately 18 h with a freeze dryer FreeZone 2.5 (Labconco, USA) and examined under a scanning electron microscope Inspect F50 (FEI, USA).
For semi-thin section observation, the anthers of fertile and male-sterile plants at different developmental stages were fixed in formaldehyde-acetic acid-ethanol (FAA) overnight and dehydrated through a graded ethanol series. The anthers were then infiltrated with a graded mixture of ethanol and Hardener II 7100 solution (Technovit, Germany) and embedded in Spurr resin. Sections were cut with a microtome DM2255 (Leica, Germany) and stained with 0.1% (m/v) toluidine blue solution. A microscope DM2000 (Leica, Germany) was used to observe and photograph the semi-thin sections.
Programmed cell death (PCD), which is accompanied by internucleosomal cleavage of chromosomal DNA, was examined. A terminal deoxynucleotidyl transferase-mediated dUTP nick-end labelling (TUNEL) assay was performed using the DeadEnd Fluorometric TUNEL system (Promega, USA) for tapetal PCD analysis in ms40. Anthers of fertile and male-sterile plants at the meiosis stage, uninucleate stage and binuclear stage were fixed for 24 h in electron microscope fixative (Servicebio, China). Paraffin sections of the anthers were then prepared. According to the supplier's instructions, the TUNEL apoptosis detection kit was used to perform in situ nick-end labelling of nuclear DNA fragments in the dark at 37 °C for 1 h. Samples were analysed with a fluorescence microscope BX51 (Olympus, Japan).

Map-based cloning of the ms40 male-sterile gene
The (ms40 × B73)F2 population was used to map the sterile gene of ms40, and genomic DNA was extracted using the CTAB (hexadecyl trimethyl ammonium bromide) method (Luan et al. 2008) with minor modifications. Bulked segregant analysis (BSA) was implemented: a fertile DNA pool and a male-sterile DNA pool were constructed by mixing equal amounts of DNA from twenty fertile and twenty male-sterile plants of the (ms40 × B73)F2 population. A total of 134 InDel markers evenly covering the 10 maize chromosomes were developed based on sequence differences between the RP125 and B73 genomes and were used to screen for polymorphisms between ms40 and B73, as well as between the fertile and sterile pools. Next, polymorphic markers were used to genotype 115 male-sterile individuals from the (ms40 × B73)F2 population to determine whether the markers were linked to the sterile phenotype. Based on the primary mapping region, four novel InDel markers were developed, and 1230 male-sterile plants from the larger (ms40 × B73)F2 population were employed for fine mapping. All information on markers is provided in Table S1.

Key candidate gene prediction of ms40
Candidate gene predictions and functional annotations were obtained from the Gramene database (http://ensembl.gramene.org/). The conserved domains of candidate genes were predicted by the NCBI Conserved Domain Search tool (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), and the data of the expression patterns were derived from an RNA-seq expression database (https://www.maizegdb.org/). The sequences of the candidate genes were amplified from ms40 and RP125, and then PCR products were sequenced and analysed.
Based on the sequence difference of ZmbHLH51 between ms40 and RP125, an SNP marker was developed according to the flanking sequence of the mutation site (SNP-F: 5'-TGT CAT TGT ACG TAC GGC GG-3', SNP-R: 5'-CGT GGG ATG TAC GGC GAT G-3'). Co-segregation analysis of phenotypes and genotypes was implemented with the (ms40 × Mo17)F2 and (ms40 × RP125)BC1F1 populations. Moreover, thirty maize inbred lines were used for the analysis of sequence conservation around the mutation site within ms40.

RNA extraction and qRT-PCR
Total RNA of roots, stems, leaves and anthers was extracted from RP125 and ms40 plants using TRIzol reagent (Invitrogen, USA). Total RNA was reverse transcribed using the Reverse Transcription Kit (Vazyme, China), and qPCR was performed using SYBR Green PCR Master Mix (TaKaRa, Japan). The Real-Time system CFX96 (Bio-Rad, USA) was used to detect the relative expression of genes. Three biological replicates and four technical replicates were performed for each sample. ZmActin was used as the internal control to normalize the expression data (Chen et al. 2017). Relative expression levels were calculated according to the 2^(−ΔΔCt) method, and all results are shown as the mean ± standard error of the mean. All information on qRT-PCR primers is listed in Table S1.
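The 2^(−ΔΔCt) calculation named above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and all Ct values are hypothetical, the reference gene stands in for ZmActin, and each list holds one Ct value per biological replicate (technical replicates are assumed to have been averaged beforehand).

```python
import math
from statistics import mean, stdev


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Return (mean fold change, SEM) of a sample relative to a control group."""
    # Delta Ct: target gene Ct minus reference gene Ct, per replicate.
    d_sample = [t - r for t, r in zip(ct_target_sample, ct_ref_sample)]
    d_control = [t - r for t, r in zip(ct_target_control, ct_ref_control)]
    calibrator = mean(d_control)
    # Delta-delta Ct per replicate, converted to fold change.
    folds = [2 ** -(d - calibrator) for d in d_sample]
    sem = stdev(folds) / math.sqrt(len(folds))
    return mean(folds), sem


# Hypothetical Ct values for a target gene in ms40 (sample) and RP125 (control).
fold, sem = relative_expression(
    ct_target_sample=[26.1, 25.9, 26.0],
    ct_ref_sample=[20.0, 20.1, 19.9],
    ct_target_control=[24.0, 24.1, 23.9],
    ct_ref_control=[20.0, 20.0, 20.0],
)
print(f"relative expression = {fold:.3f} ± {sem:.3f}")
```

A fold change below 1 indicates lower expression in the sample than in the control group.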

Rapid amplification of cDNA ends (RACE) assay
Total RNA was extracted from RP125 anthers at diverse developmental stages. The N711 Kit (Vazyme, China) was used for the RACE assay according to the manufacturer's instructions. GSP1 and GSP2, the gene-specific primers used for RACE, were designed according to the reference cDNA sequence of ZmbHLH51. The GC content of the GSP1 and GSP2 primers was required to be 50-70%, and their Tm was approximately 60 °C (GSP1: 5'-ACC TGC CTC CAT CAA TCC AGC TCG-3', GSP2: 5'-AAT GAG GTG GCA GTG CAG GCG GA-3'). GSP1 and GSP2 primers were used for 5' RACE and 3' RACE, respectively. The PCR products were cloned into the pEASY-Blunt cloning vector and sequenced. The CDS of ZmbHLH51 was predicted with the ORF finder tool (https://www.ncbi.nlm.nih.gov/orffinder/).
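The GC-content requirement stated above can be checked with a short script. This is an illustrative sketch: the helper name is hypothetical, the primer sequences are copied from the text, and the Tm is not computed here because the text does not say which Tm formula was used.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer; spaces are ignored."""
    seq = primer.replace(" ", "").upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


PRIMERS = {
    "GSP1": "ACC TGC CTC CAT CAA TCC AGC TCG",
    "GSP2": "AAT GAG GTG GCA GTG CAG GCG GA",
}

for name, seq in PRIMERS.items():
    gc = gc_content(seq)
    status = "within" if 50.0 <= gc <= 70.0 else "outside"
    print(f"{name}: GC = {gc:.1f}% ({status} the 50-70% range)")
```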

Protein activity assays
The CDS of ZmbHLH51 was inserted into the pGBKT7 vector using the ClonExpress II One Step Cloning Kit (Vazyme, China). The recombinant plasmid was then transformed into yeast strain AH109 (Tiandz, China) via a lithium acetate-mediated approach. The growth of positive transformants was examined on SD/-Trp medium and on SD/-His-Trp medium containing 50 mg l−1 X-α-gal (Coolaber, China) for 2-4 days at 28 °C. The free pGBKT7 vector and pGBKT7-GAL4 AD were used as negative and positive controls, respectively.

Phylogenetic analysis
To determine the evolutionary relationship between Zm00001d053895 and its homologs in other species, homologous sequences were searched in the NCBI database (https://www.ncbi.nlm.nih.gov/) using the ZmbHLH51 amino acid sequence as the query, and 14 homologs from Oryza sativa, Solanum lycopersicum, Brachypodium, Foxtail millet, Hordeum vulgare, A. thaliana, Solanum tuberosum, Sorghum and Triticum aestivum were retrieved. Multiple sequence alignment was performed using CLUSTALW with default settings within MEGA 6 (Higgins 1996). We used MEGA 6 to construct an unrooted phylogenetic tree via the neighbor-joining method, which was tested with 1000 bootstrap replicates, and the phylogenetic tree was visualized using EvolView (https://www.evolgenius.info/evolview/) (Zhang et al. 2012).

Co-expression analysis
Co-expression analysis was performed to identify potential interacting proteins of ZmbHLH51. The expression data of approximately 40,000 maize genes from 8 tissues of B73 were downloaded from the q-teller database (http://www.qteller.com); expression was measured as fragments per kilobase of exon per million fragments mapped (FPKM). The Pearson correlation coefficient (PCC) of each gene with ZmbHLH51 was calculated based on the expression data. Genes with PCC > 0.8 and P-values < 0.05 were considered to be co-expressed genes. The FPKM values of co-expressed genes were normalized as log2(FPKM + 1), and then the Z-scores were calculated (Sekhon et al. 2011). A gene with a Z-score larger than 2 in a tissue was considered specific to that tissue. To characterize the putative function of the ZmbHLH51 co-expressed genes, GO terms for each co-expressed gene were obtained from Gramene (http://www.gramene.org/), and GO enrichment analysis was performed using OmicShare tools (http://www.omicshare.com/tools).
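The filtering steps described above (PCC against the query gene, log2(FPKM + 1) transformation, Z-score per tissue) can be sketched in a few lines. This is a sketch under stated assumptions: the tissue names, gene names and FPKM values are hypothetical, the sample standard deviation is assumed for the Z-score, and the P-value filter is omitted for brevity.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def z_scores(fpkm):
    """Z-scores of log2(FPKM + 1) values across tissues."""
    logged = [math.log2(v + 1) for v in fpkm]
    m, s = mean(logged), stdev(logged)
    return [(v - m) / s for v in logged]


# Hypothetical FPKM profiles across 8 tissues (the last one is the anther).
TISSUES = ["root", "stem", "leaf", "ear", "silk", "embryo", "endosperm", "anther"]
query = [0.5, 0.2, 0.1, 0.3, 0.4, 0.2, 0.1, 85.0]
genes = {
    "geneA": [1.0, 0.5, 0.3, 0.6, 0.9, 0.4, 0.2, 160.0],
    "geneB": [50.0, 48.0, 52.0, 51.0, 49.0, 53.0, 47.0, 50.0],
}

for name, profile in genes.items():
    pcc = pearson(query, profile)
    if pcc > 0.8:
        specific = [t for t, z in zip(TISSUES, z_scores(profile)) if z > 2]
        print(f"{name}: PCC = {pcc:.2f}, tissue-specific in {specific}")
    else:
        print(f"{name}: PCC = {pcc:.2f}, not co-expressed")
```

With eight tissues, a Z-score above 2 is only reached when expression is concentrated in essentially one tissue, which is why this cutoff works as a tissue-specificity filter.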

The male sterility of mutant ms40 is controlled by a recessive nuclear gene
The pollen of maize inbred line RP125 was treated with EMS, and M1 plants were self-pollinated to obtain M2 seeds. The male-sterile mutant ms40 was found in the M2 generation. When ms40 was fertilized with RP125 pollen, all individuals of the (ms40 × RP125)F1 population were fertile; these plants were then self-pollinated, and the male-sterile phenotype segregated within the F2 population. Some male-sterile plants were then pollinated separately with B73 and Mo17, and all individuals of the (ms40 × B73)F1 and (ms40 × Mo17)F1 populations were fertile. The (ms40 × B73)F1 and (ms40 × Mo17)F1 individuals were self-pollinated, and the male-sterile phenotype segregated in the (ms40 × B73)F2 and (ms40 × Mo17)F2 populations regardless of whether the seeds were planted in Sichuan or Yunnan. Moreover, the segregation ratio of fertile to male-sterile plants within both the (ms40 × B73)F2 and (ms40 × Mo17)F2 populations fitted 3:1 according to the Chi-square test (Table 1). These results indicated that the sterile phenotype of ms40 was controlled by a single recessive nuclear gene. There were no remarkable differences between RP125 and ms40 in their agronomic traits (Fig. 1a, b). After tasseling, RP125 anthers were exserted and shed pollen normally, whereas ms40 tassels failed to exsert anthers and shed no pollen, and ms40 anthers were smaller and thinner than those of RP125 (Fig. 1c-f). With I2-KI staining, the pollen grains of RP125 were round and stained dark blue, while no pollen grains were found for ms40 (Fig. 1g-h).

ms40 anthers show developmental defects
To reveal the characteristics of ms40 anthers, we examined the epidermis and inner surface of anthers by SEM. The ms40 anthers were much smaller and shorter (Fig. 2b) than those of RP125 (Fig. 2a). In addition, the epidermal surface of RP125 anthers was covered with latticed waxy crystals (Fig. 2c), whereas that of ms40 anthers was smooth and lacked any cuticle (Fig. 2d). Moreover, Ubisch bodies covered the whole inner surface of RP125 anthers (Fig. 2e), while no Ubisch bodies were observed on the inner surface of ms40 anthers (Fig. 2f). In broken anthers, round pollen grains filled the RP125 anthers (Fig. 2g, i), but no pollen grains were found within ms40 anthers (Fig. 2h), which further verified that ms40 is a no-pollen type male-sterile mutant. These results showed that ms40 has a series of developmental defects in the anther and pollen.

ms40 exhibits delayed degradation of anther tapetum
A variety of anther dysplasia phenotypes have been observed in different male-sterile mutants. Understanding the cytological characteristics of pollen abortion helps explain the mechanism of failure in a male-sterile mutant. Therefore, anthers of male-sterile and fertile plants at different stages were examined using semi-thin sections. At the sporogenous and pollen mother cell stages, no substantial difference was observed between the anthers of RP125 and ms40 (Fig. 3a-d). At the meiosis stage, the tapetum began to degrade and had a paliform shape in RP125, while the ms40 tapetum remained almost intact, suggesting that degradation of the ms40 tapetum was delayed (Fig. 3e, f). Subsequently, the contents of the tapetum cells in RP125 began to condense and stained more deeply, while the ms40 tapetum swelled seriously, and irregular microspores were observed (Fig. 3g, h). At the large vacuolated stage, vacuoles had formed in the center of the microspores, and tapetum cells were further condensed and degraded in RP125. However, the microspores of ms40 began to shrink and were unable to form large vacuoles, and the tapetum cells were clearly visible with no signs of disintegration (Fig. 3i, j). At the binucleate stage, the vacuolated microspores of RP125 underwent asymmetric mitotic division and displayed a falcate shape, accompanied by complete tapetum disintegration, while the microspores of ms40 gradually degraded, and the vacuolation of the tapetum was more obvious (Fig. 3k, l). At the mature pollen grain stage, abundant pollen grains filled with starch were observed in the anther locule of RP125. Conversely, the microspores of ms40 were almost completely degraded, leaving only remnants in the locules, and vacuolized tapetum could still be observed (Fig. 3m, n).
We also performed TUNEL assays on the anthers of RP125 and ms40. At the tetrad stage, TUNEL signals were detected in the RP125 tapetum, but no signals were found in ms40 (Fig. 3o, p). At the uninucleate stage, the signals in RP125 reached a peak, whereas in ms40 the signals had only begun to appear in the tapetal cells (Fig. 3q, r). At the binucleate stage, signals could no longer be detected in RP125 but could still be detected in the expanded tapetal cells of ms40 (Fig. 3s, t). In conclusion, these results showed that PCD of the ms40 tapetum was delayed, which suggested that the delay of PCD resulted in abnormal anther development in ms40.

Zm00001d053895 is the key candidate gene of ms40
In this study, (ms40 × B73)F2 was taken as the mapping population, and 134 InDel markers covering the whole maize genome were developed based on the whole-genome resequencing data of RP125 and B73. The 134 InDel markers were used for polymorphism screening between ms40 and B73, and 73 polymorphic markers were obtained. The 73 polymorphic markers were then used to screen for polymorphisms between the male-sterile DNA pool and the fertile DNA pool. The markers umc1940 and umc1649 were selected, both of which are located at bin 4.10 on maize chromosome 4. We therefore developed several novel InDel markers between ms40 and B73, of which six were polymorphic. Then, 115 male-sterile individuals from the (ms40 × B73)F2 population were genotyped with the six InDel markers. InDel markers X98 and X72 detected 1 and 7 recombinants, respectively, among the 115 male-sterile individuals, and no recombinant was detected with InDel marker X76. Therefore, the ms40 locus was located between X98 and X72 on chromosome 4 (Fig. 4a).
To narrow the mapping region, 1230 male-sterile individuals derived from the (ms40 × B73)F2 population were genotyped with X98 and X72, and 11 and 6 recombinants were identified for X98 and X72, respectively. Four InDel markers were then developed within the mapping interval between X98 and X72. Of these, X214 and X242 each detected one recombinant among the 1230 male-sterile individuals, whereas no recombinant was detected for X168. Therefore, the male-sterile gene of ms40 was mapped to chromosome 4 between X214 and X242, a physical distance of 282 kb (Fig. 4b). According to the MaizeGDB database, a total of 5 ORFs were identified within this region (Table 2). The 5 ORFs were amplified from RP125 and ms40, and only Zm00001d053895 harbored an SNP (G to A), at position 2851, between ms40 and RP125; no sequence differences were found in DNA or cDNA for the other four genes. The SNP was located in the seventh exon of Zm00001d053895 and led to an amino acid change from Gly (GGG) to Arg (AGG) (Fig. 4c). Zm00001d053895 was predicted to be an anther-specific gene that encodes the bHLH transcription factor ZmbHLH51 (https://www.maizegdb.org/gene_center/gene/Zm00001d053895). Its homologous genes, the Arabidopsis AtAMS gene and the rice OsTDR1 gene, have both been reported to be involved in tapetum and pollen development (Ferguson et al. 2017; Li et al. 2006). The expression levels of the five candidate genes in ms40 and RP125 were also measured by qRT-PCR. The expression of Zm00001d053896 was not detected in anthers. The expression of ZmbHLH51 was detected in both ms40 and RP125; it showed no notable difference at the pollen mother cell stage but was significantly lower in ms40 than in RP125 at the tetrad stage. No significant expression differences were found between ms40 and RP125 in anthers for candidate genes Zm00001d053890, Zm00001d053891, and Zm00001d053894 (Fig. S1). All these results showed that the genotype A/A at position 2851 co-segregated with the male-sterile phenotype of ms40 (Fig. 4d). Moreover, the base at the SNP site in thirty maize inbred lines was examined using the SNP marker. Only ms40 had the homozygous A/A allele at position 2851 of ZmbHLH51, whereas the thirty inbred lines were all homozygous G/G at the corresponding locus (Table S3), suggesting that the nucleotide G at position 2851 is conserved and that position 2851 of ZmbHLH51 may be a key functional site.
Fig. 4 Map-based cloning of ms40 and mutation site analysis of key candidate genes. a Primary mapping of ms40 using the (ms40 × B73)F2 population including 115 male-sterile individuals. b Fine mapping of ms40 using the (ms40 × B73)F2 population including 1230 male-sterile individuals. ms40 was mapped to an interval of approximately 282 kb flanked by the InDel markers X214 and X242. c The gene structure and sequence alignment of ZmbHLH51 between fertile plants (RP125-1/-2/-3) and sterile plants (ms40-1/-2/-3). Black boxes and black bent lines represent exons and introns, respectively. The red arrow indicates the mutation site, which was located in exon 7 of ZmbHLH51 and resulted in a change in the amino acid from Gly (GGG) to Arg (AGG). d Sequencing diagrams of the mutation site between the fertile and sterile plants. G/G and G/A represent fertile plants, A/A represents sterile plants, and the blue box marks the mutation site in the sequencing diagrams (color figure online)
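The missense change reported in this section (codon GGG to AGG) can be checked against the standard genetic code. The sketch below is illustrative only: the helper names are hypothetical, and the codon table is the standard nuclear code written in the conventional TCAG order.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code: codons enumerated in TCAG order at each position.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def substitute(codon, index, new_base):
    """Return the codon with the base at `index` (0-based) replaced."""
    return codon[:index] + new_base + codon[index + 1:]


wild_type = "GGG"
mutant = substitute(wild_type, 0, "A")  # G > A at the first codon position
print(f"{wild_type} ({CODON_TABLE[wild_type]}) -> {mutant} ({CODON_TABLE[mutant]})")
```

The one-letter codes G and R correspond to glycine and arginine, matching the Gly to Arg change shown in Fig. 4c.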

Zm00001d053895 encodes a bHLH transcription factor
To illustrate the evolutionary relationship of ZmbHLH51, we performed a phylogenetic analysis based on 14 orthologous genes from 10 plant species that shared high sequence similarity with ZmbHLH51. Multiple sequence alignment revealed a classic HLH domain in all 14 homologous genes (Fig. 5a), suggesting that ZmbHLH51 is a typical bHLH transcription factor and that the orthologs of ZmbHLH51 might have conserved functions among various plant species. Moreover, the mutation site of ZmbHLH51 in ms40 was located within the conserved HLH domain of the bHLH transcription factor (Fig. 5b). Phylogenetic analysis showed that these genes were divided into three clades, which indicates that their molecular functions have a degree of evolutionary conservation. ZmbHLH51 shared the highest homology with Sb04g001650 (81.2%) of Sorghum bicolor (Fig. 5c), a putative TDR bHLH transcription factor connected with the development of the anther tapetum, suggesting that ZmbHLH51 may play an essential role in regulating tapetum development.

ZmbHLH51 is specifically expressed in maize anthers and has four different transcripts
To decipher the expression specificity of ZmbHLH51, qRT-PCR was performed for roots, stems, leaves and anthers at different developmental stages of RP125 with the specific primers q-51-F/R. ZmbHLH51 was preferentially expressed at the pollen mother cell and tetrad stages of anthers, and the expression level was low at the uninucleate and binucleate stages. These results were consistent with the cytological phenotypic differences observed at the meiotic stages between RP125 and ms40. In contrast to the expression levels in anthers, the expression levels in roots, stems and leaves were quite low (Fig. 6a), which suggested that ZmbHLH51 is an anther-specific gene and probably plays a key role in anther development.
In our study, a rapid amplification of cDNA ends (RACE) assay was performed to determine the structure of Zm00001d053895 transcripts, and four transcripts of ZmbHLH51 were amplified from total RNA of RP125 anthers at different developmental stages. The CDS of each transcript was predicted with the ORF finder tool (https://www.ncbi.nlm.nih.gov/orffinder/). The four transcripts encoded proteins of 560 aa (ZmbHLH51-T001), 597 aa (ZmbHLH51-T002), 603 aa (ZmbHLH51-T003) and 628 aa (ZmbHLH51-T004) (Fig. 6b). Among the four transcripts, ZmbHLH51-T001, ZmbHLH51-T003 and ZmbHLH51-T004 contained 7 exons and 6 introns, and only ZmbHLH51-T002 contained 8 exons and 7 introns. Through comparison with the shortest transcript ZmbHLH51-T001, we found that both ZmbHLH51-T003 and ZmbHLH51-T004 resulted from alternative 5′ splice sites (A5SSs), while ZmbHLH51-T002 resulted from exon skipping (ES). Sequence alignment of the four transcripts showed that their stop codon positions were the same but their start codon positions differed, so the diversity among the four transcripts was attributed to differences in transcription start sites (TSSs). In short, we found four novel transcripts of ZmbHLH51, which will help to further explore its biological function.
Fig. 5 (continued) The motifs of bHLH proteins are identified by MEME. The red arrow indicates the amino acid residues corresponding to the mutation site. c Phylogenetic tree of ZmbHLH51 and its homologs in other plant species. The percentage identities represent the sequence similarity between the corresponding proteins and ZmbHLH51. The yellow star represents Solanum tuberosum, the red circle represents Solanum lycopersicum, the blue star represents Arabidopsis thaliana, the black star represents Glycine max, the green circle represents Oryza sativa, the black rectangle represents Foxtail millet, the red star represents Zea mays, the red triangle represents Sorghum bicolor, the blue rectangle represents Brachypodium distachyon, the red rectangle represents Hordeum vulgare, and the green triangle represents Triticum aestivum (color figure online)

ZmbHLH51 is localized in the nucleus and has transcriptional activation ability
First, the TargetP-2.0 server was used to predict the putative subcellular location of ZmbHLH51, and then we performed transient expression assays in tobacco leaves. The vector p35S::ZmbHLH51-eGFP was constructed by fusing the ZmbHLH51 coding sequence to the N-terminus of eGFP under the control of the 35S promoter. As expected, the RFP signal of the nuclear localization signal (NLS) vector p35S::NLS-RFP was distributed in the nucleus of tobacco mesophyll cells, the eGFP signal of vector p35S-eGFP was distributed throughout the entire cell, and the fluorescence signals merged in the nucleus when p35S::NLS-RFP and p35S-eGFP were cotransformed. When p35S::ZmbHLH51-eGFP and p35S::NLS-RFP were cotransformed, the fluorescence signals merged in the nucleus (Fig. 7a). These results suggested that ZmbHLH51 is localized in the nucleus, which is consistent with the location predicted by the TargetP-2.0 server.
A transactivation assay was carried out to investigate the transcriptional activation ability of ZmbHLH51, for which the construct pGBKT7-ZmbHLH51 was generated. All transformants grew well on SD/-Trp medium. However, on SD/-His-Trp medium containing 50 mg/l X-α-gal, the free pGBKT7 transformant could not grow, whereas the pGBKT7-GAL4 AD and pGBKT7-ZmbHLH51 transformants grew normally and turned the indicator blue (Fig. 7b), which indicated that ZmbHLH51 has transcriptional activation ability.

ZmbHLH51 is co-expressed with some anther-specific genes
Using the maize gene expression data downloaded from the qTeller database, we identified 1192 genes across the whole genome that were co-expressed with ZmbHLH51. Among them, the male-sterile gene Zm00001d02680 (ms7) and five putative GMS genes (Wan et al. 2019), Zm00001d031312, Zm00001d033335, Zm00001d013732, Zm00001d013991, and Zm00001d035841, had expression Pearson correlation coefficients (PCCs) of 0.91, 0.98, 0.95, 0.97, 0.88 and 0.92 with ZmbHLH51, respectively. Such high correlations suggested that ZmbHLH51 may be linked to male fertility.
Next, the GO terms of the 1192 co-expressed genes were analysed. In the biological process category, metabolic process, cellular process and single-organism process were highly enriched. In the cellular component category, 107 GO terms were enriched, mostly in membrane and membrane part. In the molecular function category, binding and catalytic activity were the most abundant subcategories (Fig. 8a), and these terms have been reported to be related to alterations in male fertility (Mei et al. 2016; Qu et al. 2015; Zhu et al. 2015). After homogenization of the expression data of the co-expressed genes, we found that 647 genes were specifically expressed in anthers or pollen (Fig. 8b). These results indicated that ZmbHLH51 is very likely related to anther development.
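The co-expression screen described above rests on pairwise Pearson correlation between expression profiles. A minimal sketch follows, assuming each gene is represented by a list of expression values across the same set of tissues; the `threshold` parameter, the gene names and the values are all hypothetical, since the cutoff used to call genes co-expressed is not stated in this section.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)

def coexpressed(target, profiles, threshold):
    """Return {gene: PCC} for genes whose PCC with target is >= threshold.

    threshold is a hypothetical cutoff chosen by the caller.
    """
    hits = {}
    for gene, profile in profiles.items():
        r = pearson(target, profile)
        if r >= threshold:
            hits[gene] = r
    return hits

# Invented expression values across four tissues
target_profile = [1.0, 2.0, 3.0, 4.0]
other_profiles = {
    "gene_a": [2.0, 4.0, 6.0, 8.0],   # rises with the target
    "gene_b": [4.0, 3.0, 2.0, 1.0],   # falls as the target rises
    "gene_c": [1.0, 1.0, 2.0, 1.0],   # weakly related
}
hits = coexpressed(target_profile, other_profiles, threshold=0.9)
```

In practice a vectorized routine over the full expression matrix would be used for a genome-wide screen; the loop form is shown only to make the calculation explicit.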

The expression of several genes related to anther development was affected in ms40
Cytological observation showed that anther development in ms40 was abnormal. Therefore, the expression of several known male-sterile genes and anther development-related genes in maize was investigated by qRT-PCR, and their expression was found to be affected in ms40 compared with RP125. At the pollen mother cell and uninucleate stages, no significant differences in ZmbHLH51 expression were found between RP125 and ms40. However, at the tetrad stage, the expression of ZmbHLH51 was lower in ms40 than in RP125 (Fig. 9a). In addition, at the tetrad stage, the expression levels of Ms7, Ms9, MYB80 and bHLH122 were greatly reduced in ms40. At the uninucleate stage, the expression levels of Ms45, Ms7, MYB80 and bHLH122 were dramatically down-regulated in ms40 (Fig. 9b-f). These results showed that ZmbHLH51 may alter the expression pattern of anther development-related genes.
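For readers unfamiliar with how relative expression is usually derived from qRT-PCR, the sketch below shows the widely used 2^(-ΔΔCt) calculation with a single reference gene. This section does not state which quantification method was used, so the method itself is an assumption here; the function name and the Ct values are invented for illustration.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control.

    Assumes the 2^(-ddCt) method with one reference gene and roughly
    100% amplification efficiency (assumption, not taken from the study).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the mutant
    delta_control = ct_target_control - ct_ref_control   # dCt in the wild type
    return 2 ** -(delta_sample - delta_control)

# Invented Ct values: the target crosses threshold two cycles later in the
# mutant while the reference gene is unchanged.
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)
```

A fold change below 1 corresponds to reduced expression in the mutant relative to the wild type, which is the direction reported for the genes examined at the tetrad and uninucleate stages.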

ZmbHLH51 is probably the male-sterile gene of ms40
The male-sterile mutant ms40 was derived from the progeny of the EMS-treated inbred line RP125. The ms40 mutant exhibited no anther exertion, belonged to the no-pollen male-sterile type, and showed stable male sterility when grown in different locations. Genetic analysis indicated that ms40 was controlled by a recessive nuclear gene. Through map-based cloning, the male-sterile gene of ms40 was located on the long arm of chromosome 4 within a 282-kb region containing five ORFs. Cloning and sequencing analysis showed that only ZmbHLH51, which encodes a bHLH transcription factor (https://www.maizegdb.org/gene_center/gene/Zm00001d053895), carried a G-to-A SNP, located in its seventh exon. Its homologous genes OsTDR1 and AtAMS have been reported to be related to anther development (Ferguson et al. 2017; Li et al. 2006). In addition, we compared the expression of the five candidate genes between RP125 and ms40; the qRT-PCR results showed that only ZmbHLH51 differed in expression at the tetrad stage of anther development, whereas no significant difference was found for the other genes. Therefore, ZmbHLH51 is considered to be a key candidate gene of ms40.
Understanding the abnormal cytological characteristics of anther development is necessary for exploring the abortion mechanism of a male-sterile mutant. The tapetum, the innermost layer of the anther wall, is closely associated with the development of microspores, and its normal development is vital to pollen formation. Cytological observation showed that the ms40 anther tapetum exhibited obvious vacuolization and delayed degradation (Fig. 3a-n). The abortive features of ms40 are consistent with those of most bHLH transcription factor mutants in various species, such as Atams, Ostdr1, Osudt1, Oseat1, Zmms23 and Zmms32 (Ferguson et al. 2017; Jung et al. 2005; Ko et al. 2014; Moon et al. 2013; Nan et al. 2017; Niu et al. 2013), all of which show different degrees of tapetal abnormality. Expression pattern analysis showed that ZmbHLH51 was specifically expressed in anthers, preferentially at the pollen mother cell and tetrad stages, and was hardly expressed in vegetative tissues (Fig. 6a). Consistently, the cytological differences in anthers between RP125 and ms40 occurred at the meiotic stages. Thus, the cytological observations also supported ZmbHLH51 as the key candidate gene of ms40.

The SNP in ZmbHLH51 between RP125 and ms40 is located at a key functional site
Multiple sequence alignment showed that ZmbHLH51 has the classical HLH domain of bHLH transcription factors (Fig. 5a). The HLH domain is necessary for bHLH transcription factors to form homodimers or heterodimers and thereby regulate the expression of target genes (Elena et al. 2013), and bHLH-bHLH complexes have been confirmed to be relevant to plant fertility (Ko et al. 2014). Notably, using the NCBI-CDD tool, we found that the SNP site in ZmbHLH51 between RP125 and ms40 was predicted to lie within the binding sites of the HLH domain. In addition, the 2851st nucleotide (G) of ZmbHLH51 is conserved across different species (Fig. 5b), and this conservation also holds in thirty different maize inbred lines (Table S3). Co-segregation analysis in the (ms40 × B73) F2 and (ms40 × Mo17) F2 populations with the intragenic SNP marker showed that all male-sterile individuals had the A/A genotype and all fertile individuals had the G/G or G/A genotype (Fig. 4d). These results indicate that the G-to-A substitution was responsible for the ms40 sterile phenotype and that the SNP site is necessary for the function of ZmbHLH51 in anther development.
Figure legend (fragment): [...] indicates the relative gene expression levels. These gene expression data were retrieved from the qTeller database and log2-normalized (original data + 1)
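The co-segregation criterion described above (sterile plants are homozygous for the mutant allele, fertile plants carry at least one wild-type allele) can be written as a simple check. This is a minimal sketch; the function name, the record format and the example plants are hypothetical and do not reproduce the study's data.

```python
def cosegregates(individuals, mutant_allele="A"):
    """Check that a recessive marker allele co-segregates with sterility.

    individuals: iterable of (genotype, phenotype) pairs, for example
    ("A/A", "sterile") or ("G/A", "fertile"). Returns True only if every
    sterile plant is homozygous for mutant_allele and every fertile plant
    is not. Hypothetical helper for illustration only.
    """
    for genotype, phenotype in individuals:
        alleles = genotype.split("/")
        homozygous_mutant = all(a == mutant_allele for a in alleles)
        is_sterile = phenotype == "sterile"
        if homozygous_mutant != is_sterile:
            return False  # a recombinant or mis-scored plant breaks the pattern
    return True

# Invented F2 records
f2_plants = [
    ("A/A", "sterile"),
    ("G/A", "fertile"),
    ("G/G", "fertile"),
    ("A/A", "sterile"),
]
consistent = cosegregates(f2_plants)
```

A single fertile A/A plant or sterile G/A plant would make the check fail, which is why complete co-segregation in two independent F2 populations is treated as supporting evidence for the intragenic SNP.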

ZmbHLH51 plays an important role in anther development
In our study, we found that 1192 genes are co-expressed with ZmbHLH51. Among them, Zm00001d02680 (Ms7), Zm00001d031312, Zm00001d033335, Zm00001d013732, Zm00001d013991, and Zm00001d035841 have been reported to be associated with anther development (Wan et al. 2019). Meanwhile, we used qRT-PCR to examine whether the expression of some genes related to anther development is affected in ms40. The results showed that the expression levels of these genes were significantly lower in ms40 than in RP125 at the tetrad or uninucleate stage (Fig. 9), suggesting that these genes may interact with ZmbHLH51 in the regulatory pathway of anther development. In Arabidopsis, the AMS-MS188-MS1 fertility regulation pathway has been established, and the maize homologs of these genes are ZmbHLH51, MYB80 and Ms7, respectively. Interestingly, we found that MYB80 and Ms7 were significantly differentially expressed between ms40 and RP125, and co-expression analysis showed that Ms7 was co-expressed with ZmbHLH51. Therefore, we speculate that a ZmbHLH51-MYB80-Ms7 regulatory pathway may exist in maize and that ZmbHLH51 plays a key role in anther development.

Prospect of ms40 in maize hybrid seed production
It is well known that maize male-sterile mutants play an important role in hybrid seed production. Although various male-sterile genes have been cloned, a considerable gap remains between discovering a male-sterile mutant and applying it in seed production for a particular cross combination. Hence, the elite inbred line RP125, which has high yield and high combining ability, was selected as the base material for EMS treatment. Additionally, other studies have shown that RP125 is a phosphorus-efficient maize inbred line with resistance to a variety of diseases. Fortunately, the male-sterile mutant ms40 was obtained from its progeny, so ms40 can be used for hybrid seed production in cross combinations that use RP125 as the female parent. Across experiments in different years and locations, no obvious difference in agronomic traits was found between RP125 and ms40, which also favours the application of ms40 in hybrid production. Collectively, exploring the abortion mechanism of ms40 can not only contribute to research on anther development in maize but also substantially promote the use of male-sterile lines in hybrid seed production.