Identification of QTLs and a putative candidate gene involved in rhizome enlargement of Asian lotus (Nelumbo nucifera)

QTL mapping studies identified three reliable QTLs of rhizome enlargement in lotus. NnBEL6 located within the confidence interval of the major QTL cqREI-LG2 is a key candidate gene enhancing rhizome enlargement. Lotus (Nelumbo) is perennial aquatic plant with nutritional, pharmacological, and ornamental significance. Rhizome is an underground lotus stem that acts as a storage organ and as a reproductive tissue for asexual production. The enlargement of lotus rhizome is an important adaptive strategy for surviving the cold winter. The aims of this study were to identify quantitative trait loci (QTLs) for rhizome enlargement traits including rhizome enlargement index (REI) and number of enlarged rhizome (NER), and to uncover their associated candidate genes. A high-density genetic linkage map was constructed, consisting of 2935 markers binned from 236,840 SNPs. A total of 14 significant QTLs were detected for REI and NER, which explained 6.7–22.3% of trait variance. Three QTL regions were repeatedly identified in at least 2 years, and a major QTL, designated cqREI-LG2, with a rhizome-enlargement effect and about 20% of the phenotypic contribution was identified across the 3 climatic years. A candidate NnBEL6 gene located within the confidence interval of cqREI-LG2 was considered to be putatively involved in lotus rhizome enlargement. The expression of NnBEL6 was exclusively induced by rhizome swelling. Sequence comparison of NnBEL6 among lotus cultivars revealed a functional Indel site in its promoter that likely initiates the rhizome enlargement process. Transgenic potato assay was used to confirm the role of NnBEL6 in inducing tuberization. The successful identification QTLs and functional validation of NnBEL6 gene reported in this study will enrich our knowledge on the genetic basis of rhizome enlargement in lotus.


Introduction
Lotus (Nelumbo Adans.) is a basal eudicot that belongs to the family Nelumbonaceae, which comprises two extant species including N. nucifera Gaertn. and N. lutea (Willd.) Pers. Nelumbo nucifera is distributed throughout Asia and Northern Australia, whereas N. lutea is found in North America and the northern areas of South America (Yang et al. 2012;Zhang et al. 2011). Asian lotus have been cultivated for over 7000 years for their ornamental, edible and medicinal properties in Asian countries, especially in China, and are classified into three categories: flower, seed, and rhizome lotus, according to their use (Ming et al. 2013;Shen-Miller 2002).
Lotus rhizome is a storage organ derived from modified subterraneous stem, and acts not only as an important nutritional vegetable, but it is also reproductive tissue for lotus asexual production (Man et al. 2012). Therefore, the rhizome development studies have attracted increasing attention from the researchers. Rhizome is generated through a complex developmental process that begins with the formation of a longitudinal underground stolon, which later swells to enlarge the girth. This process can be classified into four stages, including stolon stage, initial swelling, middle swelling, and late stage of expansion.

3
Important carbohydrates, such as starch is rapidly synthesized and accumulated in the rhizome during the last two stages of development (Cheng et al. 2013;Yang et al. 2015). Rhizome girth enlargement occurs under short day (SD), and cool temperature promotes dry weight (starch) accumulation at the late stage of development (Anderson 1970;Masuda et al. 2007a, b). Rhizome transition from longitudinal elongation to the girth enlargement is regulated via the biosynthesis of GA and/or ABA (Masuda et al. 2010(Masuda et al. , 2012. Transcriptomic studies reveal that the genes related to phytohormone, photoperiod, and starch synthesis are involved in this transition (Cheng et al. 2013;Yang et al. 2015). The light and auxin signal can be transduced through Ca 2+ which serves as a secondary messenger . To date, there is still insufficient clarification on the complex process of rhizome enlargement in lotus.
The enlargement of lotus rhizome ensures winter survival and sprouting in the incoming spring season. Temperate and tropical lotus are the two ecotypes of N. nucifera, and each has different rhizome development process and adaptability to cold conditions (Zhang and Wang 2006). The underground stems of temperate ecotype show longitudinal growth in the early stage with increased girth at the late stages, and undergo all four developmental stages mentioned above. Tropical lotus ecotypes have no clear girth enlargement in their entire growth period, and are unable to survive the cold winter in subtropical and temperate regions that have freezing season (Yang et al. , 2015. Thus, other than its role as a vital asexual production tissue, the enlargement of lotus rhizome acts as an adaptive strategy for surviving through the cold winter. Developments of storage organs have been extensively studied in potato (Solanum tuberosum) and onion (Allium cepa) (Abelenda et al. 2014;Lee et al. 2013;Navarro et al. 2015). Tuberization in potato is regulated by environmental factors and a series of genes, and it normally occurs under SDs. StSP6A is a Flowering lotus T (FT) like protein that functions as tuberigen, and it is activated in the leaves only under SDs (Navarro et al. 2011). The coding protein is transported through the phloem to the stolon, where it interacts with St14-3-3s and StFDL1 to promote tuberization (Teo et al. 2017). StBEL5 is a BEL1-like homeobox gene, and its RNA functions as a phloem-mobile RNA signal that activates the expression of StSP6A in the stolon (Sharma et al. 2016). In onion, bulb formation is also regulated by the antagonistic effect of two FT-like genes, AcFT1 and AcFT4. Bulb formation is promoted by AcFT1, while AcFT4 prevents the upregulation of AcFT1, thus inhibiting bulbing. Long-day photoperiod promotes bulbing by upregulating and downregulating the expression levels of AcFT1 and AcFT4, respectively. It is predicted that AcFT1 interact with 14-3-3, but AcFT4 completely disrupt the interface with 14-3-3 (Lee et al. 2013;Lyngkhoi et al. 2019;Manoharan et al. 2016). These studies provide important cues for studying rhizome formation in lotus.
Lotus germplasm show notable variation in the phenotypes of storage rhizome, including difference in branching, elongation and expansion (Lin et al. 2019), which provides vital materials for studying the genetic mechanisms for rhizome enlargement. Quantitative trait locus (QTL) analysis is a powerful genetic approach for dissecting the complex process/trait and investigating the genetic variation within species. Recently, QTLs of rhizome yield-related traits were detected, and candidate genes were predicted in lotus (Huang et al. 2021). In order to understand the genetic control of rhizome enlargement in lotus, the two lotus ecotypes were used in this study to generate an F 2 population and to construct an SNP-based genetic linkage map. QTLs mapping for rhizome enlargement were analyzed using a 3-year phenotype data, and stable QTLs across the years were identified. Furthermore, a candidate NnBEL6 gene that putatively controls rhizome enlargement was predicted, and functionally validated in transgenic potato. Our results elucidate the genetic basis of rhizome formation and provide a foundation for future breeding of new cultivars by marker-assisted selection.

Plant materials and phenotyping
An F 2 population consisting of 138 individuals was used to construct genetic linkage map and to detect QTLs. Two parents, N. nucifera 'Baige' ('BG', female parent, temperate lotus) and N. nucifera 'Winter Red 1' ('WR1', male parent, tropical lotus) with different rhizome girth enlargements were used (Fig. 1). The mapping population was separately planted in a trial plot at Wuhan Botanical Garden of the Chinese Academy of Sciences (N30° 32′ 44.02″, E114° 24′ 52.18″), Hubei Province, China, in April, between 2014 and 2016. Each individual was planted in a separate 60-cm diameter and 45-cm depth pots in order to avoid mixing of the rhizomes. Paddy soil was added in each pot to a 20-cmdepth, and 200 g organic fertilizer (mature soybean seedcake) was applied before planting. While 20 g chemical fertilizer was top dressed at the early developing stage of standing leaf, the onset of flowering, full-blooming, and the late stage of flowering. The depth of water in the pot was kept at a 20-25 cm throughout the whole growth season.
The three phenotypic traits including number of enlarged rhizome (NER), maximum diameter of enlarged internode (MDE) and length of enlarged internode (LER) were investigated in each accession in three biological replicates. Rhizome enlargement Index (REI) was calculated as maximum diameter/length of enlarged internode according to Masuda et al. (2007b). Statistical significance of the traits was assessed using ANOVA procedure in the SAS 8.1 system (SAS Institute, Cary, NC, USA).

Resequencing analysis and genotyping
Young leaves of the mapping population were collected for DNA extraction using the cetyltrimethylammonium bromide (CTAB) method described by Doyle and Doyle (1990), and slightly modified with the addition of RNase A and proteinase K treatment to prevent RNA and protein contamination. DNA concentration was checked using a NanoDrop 2000 (Thermo) spectrophotometry and quality was analyzed with electrophoresis on an 0.8% agarose gels with a lambda DNA standard, then stored at − 20 °C until use.
Genomic DNA was fragmented to 300-500 bp by sonication for library construction and Illumina resequencing following the manufacturer's standard instructions. Quantification and quality assessment of library were carried out on an Agilent DNA 1000 LabChip analyzer (Agilent Technology 2100 Bioanalyzer). Paired-end sequencing was conducted on an Illumina HiSeq 2000 platform. Low quality (Q score < 20), contaminant sequences were filtered using the NGS QC Toolkit (http:// www. nipgr. res. in/ ngsqc toolk it. html). The clean data were then mapped to 'China Antique' masked genome using the BWA program. Only sequences with less than two mismatches were aligned to the reference genome. The SAM output file was converted to BAM format and sorted using the SAM tools suite. These high-quality reads were used for SNP calling using Perl scripts (http:// www. genom ics. ceh. ac. uk/ msatfi nder). SNPs of low quality or with clustered SNP sites were removed from the genotype data, and a missing value was used to replace genotypes occupying lower reliability (< 20).
All SNPs markers in each individual were scored as aaxbb, nnxnp, lmxll, and hkxhk. For aaxbb marker, the genotype of homozygous individual with similar genotype to the female or male parent was labeled as "a" or "b", the genotype of the heterozygous individual was labeled as "h". For nnxnp marker, the genotype of homozygous individual with similar genotype to female parent was labeled as "nn", while the genotype of heterozygous individual was labeled as "np". For lmxll marker, the genotype of homozygous individual with similar genotype to the male parent was labeled as "ll", while the genotype of heterozygous individual was labeled as "lm". For hkxhk marker, the genotype of individual 1, 2 and 3 were labeled as "hh", "kk" and "hk", respectively.

Construction of genetic linkage map
Firstly, all SNPs with > 70% missing data (more than 95 F 2 individuals) and with allelic distortion (the ratio between minor allele and major allele frequency > 0.2)  (2014, 2015, and 2016). REI was estimated as the ratio of maximum diameter to length of enlarged internode, according to the method described by Masuda et al. (2007b). Error bars show standard error of the mean of three biological replicates. Different lowercases indicate significant differences (P < 0.05) based on Least Significant Difference (LSD) test, one-way ANOVA were discarded. Secondly, missing data was imputed in the filtered SNPs to create bins using an in-house Perl script. Four steps were included in these scripts: (1) Based on the position of SNP markers on reference genome, the SNPs in every 30 kb interval were assigned into one bin using the majority rule; (2) the missing genotypes were impute using the majority rule; (3) the mis-scored genotypes were corrected with the following criteria: when the marker was different from the adjacent ten markers (five markers above or below) in the same genotype, it was considered as a mis-scored genotype, and then corrected to the same genotype; (4) the markers with 100% identical genotype across all individuals were merged to one bin. Thirdly, the bins of aaxbb, lmxll and nnxnp were merged to construct genetic linkage map using JoinMap 4.0 software (Van Ooijen 2006). The regression mapping algorithm was used for map construction. The goodness of fit threshold was set to ≤ 5.0 with logarithm of the odds ratio (LOD) scores > 1.0 and a combination frequency < 0.4. Recombination frequency was converted to linkage distances in centimorgan (cM) using the Kosambi mapping function (Van Ooijen 2006). Bins on the linkage map were aligned to the reference genome of lotus 'China Antique', based on their position (Ming et al. 2013).

QTL analysis and candidate gene prediction
QTLs were detected by the composite interval method (CIM) using WinQTL Cartographer 2.5 software (Wang et al. 2012). CIM was used to scan the genetic map and estimate the likelihood of a QTL and its corresponding effect at every 2 cM interval. For each trait, the threshold for detecting a significant QTL (P < 0.05) was estimated by 1000 permutations. The coincidental significant QTL for the same traits from the 3 climatic years were integrated into consensus QTL using the BioMercator 2.1 software (Arcade et al. 2004). For nomenclature of significant QTL, "q" was used to denote QTL, followed by an abbreviation of the trait (e.g., NER), then the 3 climatic years (2014, 2015 and 2016), the linkage group number (LG1-LG8), and finally arbitrary letter codes (a, b, c…). For nomenclature of consensus QTL, "cq" was used to denote consensus QTL, followed by an abbreviation of the trait, and then the linkage group number, finally arbitrary lower-case letter code.
Based on the alignment between the linkage map and the reference genome of lotus 'China Antique', the genes in the interval of consensus QTLs were identified by referring to the lotus genome annotation (Ming et al. 2013). The expression patterns of identified genes were then analyzed from transcriptomic data (Yang et al. , 2015.

Molecular cloning and sequence analysis of NnBEL6
To isolate full-length complementary DNA (cDNA) of NnBEL6, the youngest rhizome at the initial swelling stage was sampled from 'BG', 'WR1' and a 'ZO' lotus accession with REI of 0.46 (Fig. 1). Total RNA was prepared using TRIzol® reagent (Cat. no. 15596-026, Invitrogen) according to the manufacturer's instructions. First-strand cDNA was synthesized from 1 µg total RNA and used as a template for reverse transcriptase (RT)-PCR. The coding regions of the candidate genes were amplified from the cDNA using Pfu polymerase (TakaRa). The PCR products were purified and sequenced directly. The pair of primers used to amplify cDNA are listed in Table S1.
The genomic regions of up to 8485 bp from 1.5 kb upstream to the stop codon of NnBEL6 were sequenced in the two parental genotypes and 23 other lotus cultivars which were phenotypically divided into three classes based on REI as, class I, with REI < 0.1; class II, with 0.1 < REI < 0.4; and class III, with REI > 0.4. Of the tested lotus accessions, seven had class I rhizome, while class II rhizome and class III rhizome each had nine lotus members. The sequence comparison from these lotus accessions was performed in Bioedit software (http:// www. mbio. ncsu. edu/ bioed it).

Expression analysis of NnBEL6
The two cultivars, 'WR1' and 'ZO', were used to analyze the spatial and temporal expression of NnBEL6. The leaf, petiole, flower-bud, pedicel, rhizome and root samples were collected from three different, randomly selected plants at the stolon and initial swelling stages, and immediately frozen in liquid nitrogen and stored at − 80 °C prior to RNA extraction. Total RNA from each tissue sample was extracted for quantitative RT-PCR using TB Green Premix (www. takara-bio. com) on a Step One Plus Real-time PCR System (Applied Biosystems), following the manufacturers' instructions. Specific primer pairs sequences for NnBEL6, and the control actin gene used are listed in Table S1. All reactions were carried out in three independent biological replicates. The comparative critical threshold (2 −△△CT ) method was used to analyze the relative NnBEL6 gene expression level (Livak and Schmittgen 2001).

Potato transformation
The cDNA sequence of NnBEL6 gene from 'ZO' was first cloned into the pGEM-T Easy vector (Promega). KpnI/XbaIdigested fragments were then inserted in the intron-GUS of pCambia-2300-35S-EGFP vector, to ensure the expression of transgene is driven by the cauliflower mosaic virus CaMV35S promoter. These constructs were introduced into Agrobacterium tumefaciens strain Gv3101, which was then transformed into the 'E3' line, as previously described (Zhou et al. 2019). Transformants were confirmed by PCR amplifying a fragment of approximately 2100 bp with specific primer (Table S1). Six positive transgenic lines were screened, and three positive ones were chosen for in vitro tuberization test.
For in vitro tuberization test, transgenic plants overexpressing NnBEL6 were cultured using single stem nodes on MS medium (Murashige and Skoog 1962) supplemented with 8% sucrose at 22 °C, under a photoperiod of 8 h of light/16 h of dark (SD), with the light intensity ranging from 800 to 1000 lmol m 2 s −1 . The experiment was carried out in three biological replicates. Microtubers were harvested approximately 4 months after transplantation. To detect the endogenous expression levels of StSP6A and StSP5G in WT (wild type) 'E3' controls and transgenic lines, leaves were harvested from 3-week-old plantlets for qRT-PCR. The potato actin gene (GenBank accession number: XM006350963) was used as internal control.

Sequencing and mining of potential SNPs in lotus
A total of 138 F 2 individuals were developed from a cross between 'BG' and 'WR1'. Two parents were sequenced at a relatively high coverage using WGS. In total, 10.8 Gb (11.6-fold genome coverage) and 9.9 Gb (10.7-fold genome coverage) high-quality short-read sequences were generated for 'BG' and 'WR1', respectively (Table S2). All high-quality reads from 'BG' and 'WR1' were aligned to the repeat masked reference genome of 'China Antique' to detect putative SNPs. As results, 70.5% and 85.4% of the reads (Q > 20) from 'BG' and 'WR1' were aligned to the reference genome. A total of 949,840 and 900,010 putative SNPs were identified in 'BG' and 'WR1', respectively (Table 1). Among them, 296,706 SNPs were polymorphic between the two parents. For the polymorphic SNPs, 244,926 and 44,802 were heterozygous for 'BG' and 'WR1', respectively, indicating a high rate of heterozygosity in the maternal parent. Polymorphic SNPs markers were grouped as, aaxbb, lmxll, nnxnp, and hkxhk, based on parental genotypes, each with 27,738, 224,166, 24,042, and 20,760 markers, respectively (Table 1).
An overall genome sequence of 482.6 Gb was generated in the 138 F 2 individuals, with an average of 3.5 Gb for each individual, which was equivalent to approximately 3.8× coverage of the reference genome (Table S2). As a result, 64.2% of high-quality reads (Q > 20) of the F 2 individuals was mapped on the reference genome. Based on the parental polymorphic SNPs, a total of 285,895 SNPs were identified, with 27,527, 214,643, 23,811, and 19,914 in aaxbb, lmxll, nnxnp, and hkxhk, respectively. Moreover, these SNPs were binned as 1069, 1933, 638, and 391 markers in aaxbb, lmxll, nnxnp, and hkxhk, respectively ( Table 1).

Construction of linkage map and comparison with the draft genome
Due to high heterozygosity of both parental genomes, most loci in the F 2 progenies were heterozygous. The hkxhk marker could not be distinguished from the parents, thus it was not used for genetic linkage map construction. As a result, the bins from aaxbb, lmxll, and nnxnp SNP were merged to construct genetic linkage map (Tables 2, S3). This genetic map contained 2935 bins generated from 236,840 SNPs. These bins were distributed on eight genetic groups spanning 896.11 cM with an average marker interval of 0.27 cM. Marker distribution and linkage length varied among linkage groups. LG1 was the largest with 170.99 cM, and contained 617 markers. The shortest group LG8 was 33.30 cM, and contained 114 markers. The highest marker density was on LG6, with an average marker interval of 0.21 cM. The lowest marker density was on LG4, with an average marker interval of 0.47 cM. The largest gap among all linkage groups was 9.63 cM on LG7. The information of all markers including marker name, linkage group, genetic distance, and marker type are listed in Table S3. The genetic map was compared with the lotus draft genome (Fig. 2). The eight linkage groups were integrated on megascaffolds 1-15. All linkage groups except LG8 were anchored to at least two megascaffolds. LG1 covered the 1-35 Mb, 49-60 Mb, 73-77 Mb, 114-255 Mb regions of megascaffolds 1 and 8, respectively. LG2 was anchored to the 70-110 Mb region of megascaffolds 2 and 6, and to 1-20 Mb or 26-50 Mb region on megascaffolds 10 and 15, respectively.

QTL analysis for rhizome enlargement
The variation in phenotypic traits among segregating population was used as a key parameter for QTL mapping. The two parents, 'BG' and 'WR1', exhibited significant differences on four phenotypic traits, NER, MDE, LER and REI. 'BG' had more NER and higher REI than 'WR1' (Fig. 1). In the F 2 population, these four phenotypic traits were continuously distributed and showed significant transgressive segregation (Fig. S2). ANOVA results showed that genotype (G), environment (E), and their interactions (GxE) had highly significant effects (P < 0.001) on these traits. The G was the main factor controlling REI, and the E for LER. The total variance in NER and MDE was determined partially by the E along with the G (Table S4).
Mapping analysis for NER and REI identified 14 significant QTLs in the F 2 population in 3 climatic years (Table 3, Fig. 3A). For NER, four QTLs distributed on LG5 and LG6 were detected, with their phenotypic variation ranging from 7.5 to 14.8%. The region which covered 9.4-11.8 cM on LG6 was repeatedly found across the 3 years. Ten QTLs on LG1, LG2, LG5 and LG6 were identified for REI in the 3 years, with phenotypic variation of these QTLs ranging from 6.6 to 22.8%. A region covering a 28.4-29.9 cM on LG1 was consistently detected in 2 years, and another region covering a 12.3-13.5 cM on LG2 was detected in all the 3 climatic years (Table 3). These significant QTLs with overlapping confidence intervals were integrated into the consensus QTLs (Fig. 3A). As a result, one consensus NER QTL designated cqNER-LG1, with a positive additive effect and covering 2.4 cM, was integrated. Two consensus QTLs, cqREI-LG1 and cqREI-LG2, were identified for REI covering a 1.5 cM and 1.2 cM regions on LG1 and LG2, respectively. cqREI-LG2 explained about 20% of the phenotypic variation, and was the major QTL for REI. The two consensus REI QTLs had a positive additive effect, suggesting that alleles from the female parent of 'BG' contributes to rhizome enlargement.

Candidate genes underlying the major cqREI-LG2 QTL interval
A major cqREI-LG2 QTL showing 20% of the phenotypic variation and with a rhizome-enlargement effect was identified across the 3-year QTL mapping period. The QTL interval of cqREI-LG2 covered a 4.5 Mb region on megascaffold 6, which harbored 59 annotated genes (Table S5). Of these genes, 11 had low or no expression in all tested tissues including leaves, petiole, petals, pistils, stamens, roots and rhizomes, while the remaining 48 genes were highly   (Fig. S3). A previous transcriptome data showed an increase in the number of candidate genes controlling the conversion of stolon to rhizome between the ZT1 and ZT2 or RT2 and RT3 developmental stages in two lotus cultivars of 'RL' (tropical lotus with stolon) and 'ZO' (temperate lotus with enlarged rhizome), and these genes were considered candidate for rhizome girth enlargement (Yang et al. 2015). Of the 48 highly expressed genes in this study, nine were predicted as candidate genes for rhizome girth enlargement (Fig. 3B). Among them, four including NNU_05272, NNU_05296, NNU_05305 and NNU_05314 involved in cell and lipid metabolism were observed. A soluble starch synthase-like gene, NNU_05273, was also identified. In addition, two genes, NNU_05287 and NNU_05299, encoding ubiquitin E2 enzyme and peroxidase, respectively, were detected. Moreover, a BEL1-like homeobox protein 6 homolog gene, NNU_05289, and a bHLH transcription factor, NNU_05313, were identified. Expression levels of NNU_05273, NNU_05289 and NNU_05305 increased at the RT3 and ZT2, while the remaining six genes had comparatively low expression levels at these two stages (Fig. 3B). StBEL5 has been shown to be a transcriptional activator of StSP6A which acts as a tuberization signal in potato (Abelenda et al. 2014;Sharma et al. 2016). The BEL1-like homeobox protein 6 homolog gene, NNU_05289, designated as NnBEL6, is the most likely candidate gene that potentially control rhizome enlargement. Thus we sequenced 8485 bp of the genomic region of NnBEL6 from 1.5 kb upstream to the stop codon in the two parental genotypes and other 23 lotus accessions phenotypically classified into class I, II and III with varying REI as mentioned above (Fig. 3C). A total of 17 polymorphic sites were identified, with four in the promoter region, one in the first exon (124), ten in the first intron, one in the second intron (5923) and one in the fourth exon (6695). Only two polymorphic sites at − 1347 and 3741 were detected between class I and II lotus. The remaining 15 polymorphic sites were observed between class III lotus and those of class I or II, and were highly conserved in most cultivars, except for two lotus accessions (Fig. 3C). These results suggested that two polymorphic sites at − 1347 and 3741 in NnBEL6 could be crucial for determining rhizome phenotype in lotus. Thus, NnBEL6 is a key candidate gene regulating rhizome enlargement in lotus.

Functional validation of NnBEL6 gene in promoting lotus rhizome enlargement
NnBEL6 encodes a putative protein of 751 amino acids and contains four conserved regions: the conserved BELL domain (Ger340-Ile424), the homeodomain (Trp393-Glu540), the N-terminal SKY, and C-terminal VSLTLGL boxes (Fig. S4A). Phylogenetic analysis constructed using the alignments of NnBEL6 protein with the BEL-like genes from Arabidopsis thaliana and Solanum tuberosum revealed a close relationship between NnBEL6 and AtBEL6 or AtBEL7 (Fig. S4B). The tissue distribution and expression pattern of NnBEL6 were determined in 'ZO' with REI of 0.46, which was significantly higher than  (Figs. 1, 4). NnBEL6 was expressed in the leaf, petiole, flower bud, pedicel, rhizome and root at the stolon and initial swelling stage of 'ZO' (Fig. 4B). At the stolon stage, NnBEL6 transcript was highly accumulated in the rhizome and roots, but with decreased levels in the leaf, petiole, flower, and pedicel. At the initial swelling stage, NnBEL6 was primarily expressed in the rhizome with a fourfold higher transcript levels than that at the stolon stage. In contrast, the expression of NnBEL6 in other tested tissues showed no significant variation. NnBEL6 expression in first rhizome internode was higher in 'ZO' than that in 'WR1' at the initial swelling stage (Fig. 4C). Comparison of rhizome change and NnBEL6 mRNA levels over the developmental time-course indicated its strict correlation with the developmental process, suggesting that NnBEL6 might play a positive role in controlling the initial rhizome enlargement. Subcellular localization of NnBEL6 performed in tobacco epidermal cells indicated that NnBEL6 was a nuclear protein (Fig. S5).
To further confirm the function of NnBEL6 in the formation of storage organ, an overexpressing NnBEL6 construct driven by the CaMV 35S promoter was transformed into the potato cultivar 'E3', which can only form microtuber under SD in vitro (Fig. 5). The results demonstrated that under SD, overexpressing NnBEL6 promoted tuber formation. The transgenic lines developed microtuber at approximately 95 days after transplantation (DAT), while microtuber formation started in the WT 'E3' controls at 109 DAT (Fig. 5A, B). The transgenic lines formed an average of 3.2 microtubers per plant at 120 DAT (Fig. 5C). To dissect the effect of NnBEL6 on signal transduction in the photoperiodic tuberization pathway, we further assessed the endogenous expression of two reported key genes in the photoperiodic tuberization pathway, StSP6A and StSP5G, in the transgenic lines. The expression level of StSP6A was significantly up-regulated in 35S:NnBEL6 transgenic plants (Fig. 5D), which was consistent with the observed early tuberization result. However, the expression level of StSP5G showed varied profiles, with a slight down-regulation in two transgenic plants, and an up-regulation in one transgenic plant (Fig. 5E).

Discussion
High-quality genetic map is important for identifying genes of complex traits. A series of genetic maps have been constructed in lotus using SSR and SNP molecular markers from several segregating populations (Liu et al. 2016;Yang et al. 2012;Zhang et al. 2014). The first genetic map of lotus was constructed based on 224 SSR markers using an F 1 population derived from a cross between Asian lotus 'Chinese Antique' and American lotus 'AL1' (Yang et al. 2012). More recently, a much higher density genetic map for the same mapping population was constructed using restriction-site associated DNA sequencing (RADseq), and a total of 634 bins with 4098 SNPs were found distributed in the American lotus genetic map (Zhang et al. 2014). In addition, a double digest restriction site-associated DNA sequencing (ddRADseq) was used to construct a linkage map of 581.3 cM, with 791 bin markers consisting of 8971 SNPs sorted into eight LGs (Liu et al. 2016).
Moreover, an F 2 population and genotyping-by-sequencing (GBS) technique was used to construct a genetic map with 1475 bin markers containing 12,113 SNP markers (Huang et al. 2021). In this study, we report for the first time a whole genome sequencing (WGS) approach for detecting SNPs between two parental lotus genotypes. A total of 2935 bins consisting of 236,840 SNPs were distributed on eight LGs, spanning 896.11 cM ( Table 2). The number of markers identified and length of the linkage map in this study are higher than those previously reported, suggesting the accuracy and uniformity of SNP detected by WGS. This could be due to the direct whole genome sequencing without restriction site digestion of associated DNA tags. Thus, WGS-based SNP discovery technique is unprecedented approach for rapid construction of highdensity genetic maps, which can be genotyped on a much larger scale (Huang et al. 2009;Scheben et al. 2017), and it has been applied in the construction of linkage maps and QTL detection in species such as, Eucalyptus (Bartholome We observed a highly unbalanced distribution of SNP markers in the two lotus parents used in this study. The number of lmxll markers was extremely higher than for nnxnp markers (Table 1), this could be attributed to the unbalanced heterozygosity in parental genomes. 'BG' is a progeny of several cross between Chinese lotus, and it exhibited a high ratio of heterozygosity, while 'WR1' is a wild Thai lotus, with lower level of heterozygosity (Huang et al. 2018). This could explain the observed high SNPs density with higher ratio of heterozygosity between their genomes. Normally, only homozygous aaxbb markers are used for the F 2 population linkage map construction (Huang et al. 2021;Liu et al. 2016). However, due to the high heterozygosity of the two lotus parents used in this study, three heterozygous markers lmxll, nnxnp and hkxhk were used in the F 2 population linkage map construction. The number of heterozygous markers were much higher than the homozygous aaxbb makers (Table 1), thus using aaxbb marker could risk omitting several chromosomal coverage and breakpoints, thereby reducing the accuracy and resolution of linkage map. This assumption was validated by our results that more gap and lower coverage existed in a set of linkage maps individually constructed with aaxbb, lmxll, and nnxnp than in merged linkage map (Figs. 2, S6). This suggested that both heterozygous and homozygous markers should be used for construction of genetic maps of F 2 mapping population derived from highly heterozygous parents.
The consensus linkage map was compared with the draft genome, and approximately 67.6% of 'China Antique' draft genome was anchored onto the ten largest megascaffolds, namely megascaffolds 1-10. In addition, putative false joins within the draft genome assembly were detected (Fig. 2). Megascaffold 1 was split into eight blocks (A-H). Block A, C, E, H were linked with megascaffold 8, and assembled as the longest megascaffold. Megascaffold 6 was linked with blocks 10A, 10C and 2E, and was reassembled as a new megascaffold (Fig. S1). Similar results were also observed by fluorescence in situ hybridization (FISH) analysis in lotus (Meng et al. 2017). The FISH assay also revealed that megascaffolds 1-10 were assigned to eight chromosomes. In addition, the assay showed that megascaffolds 1 and 8 were located on the longest chromosome, while megascaffolds 6 and 10 were on the same chromosome. We speculated that some segments of the draft genome of 'China Antique' were previously misplaced, which could be due to the fact that the assembly of the draft genome was based on the genetic map of American lotus. Moreover, the small population size of 51 individuals used in the American lotus could be responsible for the failure to identify several recombination breakpoints in the chromosomes (Ming et al. 2013;Yang et al. 2012). Due to these factors, BioNano and PacBio were used to generate a more accurate draft genome of 'China Antique' (Gui et al. 2018;Shi et al. 2020). We compared the two versions of 'China Antique' genome, and found that the chromosomal rearrangements between them, which confirmed the accuracy of our generated linkage map (Fig. S7).
Rhizome is an important lotus storage organ that is also used for asexual propagation. The enlargement of rhizome is a strategy that enables lotus plants to survive the cold winter. As mentioned above, the temperate lotus undergoes four developmental stages, and produce enlarged internodes in late autumn, which can survive the winter in the temperate and sub-temperate regions (Yang et al. 2015). A previous study showed a notable phenotypic variation in rhizome branching, elongation and expansion of lotus germplasm (Lin et al. 2019). Higher REI ensures stronger survival ability against low winter temperature, thus studies on the mechanism of rhizome enlargement are vital for understanding the lotus survival and propagation. The increase in rhizome girth is associated with photoperiod and hormone, which are induced by SDs and ABA (Masuda et al. 2007a(Masuda et al. , b, 2010(Masuda et al. , 2012. It has been reported that the genes in photoperiod pathway, starch metabolism and hormone signal transduction are involved in rhizome girth enlargement Cheng et al. 2013;Yang et al. 2015). Here, we used QTL mapping to uncover the genetic determinants of lotus rhizome enlargement. Fourteen significant QTLs were identified for NER and REI on LG1, LG2, and LG6 in the 3 climatic years. Three QTL regions, cqREI-LG1, cqREI-LG2, and cqNER-LG6, which anchored to megascaffolds 1, 6 and 2, respectively, were reliable for NER and REI (Table 3, Fig. 3). Although Huang et al. (2021) also detected QTLs for REI on LG1 and LG2 in the 'WR1' and 'ZO' F 2 mapping population, these QTL were anchored to megascaffolds 3 and 4, and could only be found in a single mapping year. Thus, these QTLs identified in the previous study had no collinearity with our newly identified and highly reliable QTLs.
A BEL1-like family transcription factor, NnBEL6, located within a confidence interval of a major QTL designated cqREI-LG2, with the highest phenotypic contribution, was identified as a key candidate gene (Fig. 3). The expression of NnBEL6 was highly correlated with REI in the two cultivars of 'RL' and 'ZO' (Figs. 3, 4). Members of the BEL1-like family transcription factors are associated with aspects of plant growth such as meristem maintenance, tuberization, floral and root development, and have been identified in many plant species (Brambilla et al. 2007;Hannapel et al. 2013;Kumar et al. 2007;Sharma et al. 2014). Our study revealed that NnBEL6 is expressed in all tissues, with high transcript levels in rhizome. In addition, the level of RNA accumulation in enlarged lotus rhizome was enhanced from the stolon stage to the initial swelling stage (Fig. 4). Notably, one SNP at position 3741 and one Indel at position − 1374 showed dramatic differences between the sequences of temperate and tropical lotus cultivars, which suggest they could be functional polymorphic sites for initiating lotus rhizome enlargement (Fig. 3C). Overall, our results revealed that NnBEL6 is a key candidate gene potentially involved in rhizome enlargement of temperate lotus cultivars. StSP6A is as key tuberization signal in potato which is activated by StBEL5 (Abelenda et al. 2014;Sharma et al. 2016). In this study, ectopic expression of 35S:NnBEL6 promoted tuberization in transgenic potato (Fig. 5A-C). The endogenous expression level of StSP6A in transgenic lines was highly induced (Fig. 5D). It is speculated that NnBEL6 directly activated the expression of StSP6A and induced tuberization in potato. StSP5G is a homolog of FT that directly represses StSP6A, thus inhibiting tuberization (Abelenda et al. 2014). Here, the expression level of StSP5G showed varied profiles in overexpressing NnBEL6 potato lines, with a slight downregulation in two transgenic lines, and an up-regulation in one transgenic line (Fig. 5E). We predicted that there was no direct regulatory relationship between NnBEL6 and StSP5G, thus overexpression of NnBEL6 inconsistently influenced the expression of StSP5G in different transgenic lines. NnBEL6 and StSP5G might positively and negatively regulate their upstream gene, StSP6A, respectively. Further studies to elucidate the relationship between NnBEL6 and StSP6A are still needed. Overall, our functional analysis demonstrated the role of NnBEL6 in promoting lotus rhizome enlargement.

Conclusions
We constructed a high-density SNP genetic map of N. nucifera with 2935 markers binned from 236,840 SNPs. Fourteen significant QTLs for NER and REI were identified in 3 climatic years, which explained 6.7-22.3% of trait variance. Three QTL regions, cqNER-LG1, cqREI-LG1 and cqREI-LG2 were repeatedly identified in at least 2 years. cqREI-LG2 explained about 20% of the phenotypic variation, and was considered as a major QTL for REI. NnBEL6 was identified as a putative candidate gene underlying cqREI-LG2. The role of NnBEL6 in rhizome enlargement in lotus was validated through gene expression analysis, sequence comparison and transgenic overexpression in potato.