Construction of an ultrahigh-density linkage map and graphical representation of the arrangement of transcriptome-based unigene markers on the chromosomes of Allium cepa L.

Genomic information for Allium cepa L. is limited as it is heterozygous and its genome is very large. To elucidate potential SNP markers obtained by NGS, we used a complete set of A. fistulosum L.-A. cepa monosomic addition lines (MALs) and doubled haploids (DHs). These were the parental lines of an A. cepa mapping population for transcriptome-based SNP genotyping. We mapped the transcriptome sequence reads from a series of A. fistulosum-A. cepa MALs onto the unigene sequence of the doubled haploid shallot A. cepa Aggregatum group (DHA) and compared the MAL genotype call for parental bunching onion and shallot transcriptome mapping data. We identified SNP sites with at least four reads on 25,462 unigenes. They were anchored on eight A. cepa chromosomes. A single SNP site was identified on 3,278 unigenes and multiple SNPs were identified on 22,184 unigenes. The chromosome marker information was made public via the web database Allium TDB (http://alliumtdb.kazusa.or.jp/). To map these markers, we gathered RNA sequence data from 96 lines of a DHA × doubled haploid bulb onion A. cepa common onion group (DHC) mapping population. After selecting co-dominant SNP sites, 16,872 SNPs were identified in 5,339 unigenes. Of these, at least two SNPs with identical genotypes were found in 1,435 unigenes. We developed a linkage map using genotype information from these unigenes. All unigene markers mapped onto the eight chromosomes and graphical genotyping was conducted based on the unigene order information. Another 2,963 unigenes were allocated onto the eight chromosomes. To confirm the accuracy of this transcriptome-based genetic linkage map, conventional PCR-based markers were used for linkage analysis.


Background
The genus Allium comprises economically important vegetable crops such as bulb onion (A. cepa L.), garlic (A. sativum L.), bunching onion (A. fistulosum L.), leek (A. porrum L.), and numerous wild species (Hanelt, 1990) [1]. Bulb onion is a major vegetable crop worldwide. According to the FAOSTAT database, global bulb onion production was ~ 96 million t and ranked second after tomatoes in terms of vegetable crop cultivation in 2018 [2]. Allium cepa L. consists of the common onion (bulb onion) and the Aggregatum (shallot) groups (Jones and Mann, 1963) [3]. Shallot is also an important vegetable crop and is cultivated mainly in Europe, Southeast Asia, and Africa. Though it differs morphologically and ecologically from bulb onion, the two are easily crossed (Astley et al., 1982) [4]. Shallot has a short growing period and is resistant to Fusarium oxysporum (Vu et al., 2012) [5]. Hence, analysis of its genome might generate valuable information applicable to bulb onion breeding. Bulb onion breeding is time-consuming and labor-intensive because bulb onion is biennial and, owing to severe inbreeding depression, genetically heterogeneous. To facilitate bulb onion breeding efforts, it is therefore necessary to develop effective methods such as DNA marker-assisted selection.
The advent of next-generation sequencing (NGS) has enabled the accumulation of large amounts of sequence information and the identification of numerous single-nucleotide polymorphisms (SNPs) to develop markers in plants with large genomes (Takahagi et al., 2016) [23]. NGS has been used to generate SNP markers in bulb onions via transcriptomic and selected genomic region bases (Duangjit et al., 2013; Jo et al., 2017; Choi et al., 2020) [24][25][26]. Numerous SNPs were identified by these approaches. However, only 597, 202, and 319 SNP markers were anchored on each genetic map, respectively, because of parental line heterozygosity in the mapping population. For the effective use of NGS technology, plant materials with sufficient homozygosity must be applied to the parental lines of the mapping population. [27][28][29]. For Allium, we developed shallot and bulb onion DH lines and their F1 hybrids for use in genetic analysis (Abdelrahman et al., 2015; Wako, 2016) [30,31]. We also developed several bunching onion (Allium fistulosum L.)-shallot monosomic addition lines (MALs) (Shigyo et al., 1996) [32]. These have been used to assign genetic linkage maps to A. cepa chromosomes by seeking shallot-type alleles among the eight MALs (van Heusden et al., 2000b; Martin et al., 2005) [9,12]. The combination of these plant resources could enhance potential SNP genotyping by NGS.
Here, we performed a transcriptome analysis on MALs to generate information about chromosome-specific unigene markers. We conducted transcriptomics on the F2 population derived from a cross between shallot and bulb onion DH lines. We also constructed a high-density genetic linkage map by elucidating the potential SNP sites generated by NGS.

Results And Discussion
Unigene chromosome anchoring by SNP genotyping via MAL RNA sequencing
To create unigene markers, we collected MAL transcriptome data via frequent RNA sequencing (RNA-Seq). We identified candidate genes related to the physiological traits of each line (Abdelrahman et al., 2017) [33]. The transcriptome sequence reads obtained for each MAL were mapped onto a doubled haploid shallot (DHA) unigene sequence. The unigene transcript levels were evaluated by RPKM (reads per kilobase of exon per million mapped reads).
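As a point of reference, the standard RPKM definition can be written as a short calculation. The sketch below is illustrative only: the function name and the example numbers are hypothetical and are not taken from the present data set.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads.

    RPKM = reads * 1e9 / (length in bp * total mapped reads)
    """
    if length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000-bp unigene
# in a library with 20 million mapped reads.
example = rpkm(500, 2000, 20_000_000)
print(example)
```

Normalizing by both unigene length and library size allows transcript levels to be compared across unigenes and across MAL libraries of different sequencing depth.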
We performed SNP discovery and genotyping as advanced applications of the mapping data. By comparing the genotype calls of the transcriptome mapping data between the MAL parental lines (bunching onion and shallot), we selected SNP sites with alternative homozygous calls in bunching onion and reference homozygous calls in shallot. Among 56,161 DHA unigenes, sites with ≥ 4 reads coverage in all eight MALs were identified on 25,462 unigenes (Table 1). Of these, one SNP was identified in 3,278 unigenes. Their chromosome assignments could be completed because a heterozygous genotype was identified in only one MAL in each case. In contrast, multiple SNP sites were identified in 22,184 unigenes. Of these, 21,996 could be allocated to single physical chromosomes, because the extra chromosome of the MAL showing heterozygous genotypes was consistent across the multiple SNPs of each unigene. For the remaining 188 unigenes, ≥ 2 of the multiple SNPs were ambiguous. There were heterozygous genotypes in eight MAL types and/or parental homozygous genotype(s). The corresponding gene may have been downregulated and the shallot gene may have had partial homology. These unigenes were assigned to a chromosome based on other marker(s) with a single heterozygous genotype in the MALs, labeled with the R indication, and mapped by representative SNPs. A total of 25,462 unigenes were anchored on eight chromosomes. There were 4,513 unigenes on chromosome 2 and only 2,169 unigenes on chromosome 8.
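The anchoring logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the call labels ("het", "alt"), the function names, and the return values are hypothetical. A SNP site is treated as informative when exactly one of the eight MALs is heterozygous, and a unigene is anchored when all informative sites point to the same extra chromosome.

```python
def assign_chromosome(site_calls):
    """Anchor a unigene from per-site MAL genotype calls.

    site_calls: list of dicts, one per SNP site, mapping the extra
    chromosome of each MAL (1-8) to a call such as "het" or "alt".
    Returns (chromosome, flag) or None if no consistent assignment exists.
    """
    informative = []
    ambiguous = 0
    for calls in site_calls:
        het_mals = [chrom for chrom, call in calls.items() if call == "het"]
        if len(het_mals) == 1:
            informative.append(het_mals[0])   # site points to one chromosome
        else:
            ambiguous += 1                    # zero or several heterozygous MALs
    if not informative or len(set(informative)) != 1:
        return None
    # "R" marks unigenes anchored despite ambiguous sites (assumed labeling).
    flag = "R" if ambiguous else "unambiguous"
    return informative[0], flag


def calls_with_het(chrom):
    """Hypothetical helper: only the MAL carrying `chrom` is heterozygous."""
    return {c: ("het" if c == chrom else "alt") for c in range(1, 9)}


single = assign_chromosome([calls_with_het(2)])
mixed = assign_chromosome([calls_with_het(5), {c: "het" for c in range(1, 9)}])
conflict = assign_chromosome([calls_with_het(1), calls_with_het(3)])
print(single, mixed, conflict)
```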
DHA unigene information has been made public through the web database 'Allium Transcriptome DataBase' (TDB) at http://alliumtdb.kazusa.or.jp. We integrated the chromosome marker information onto each page of the corresponding unigene. These anchoring markers are useful in genome sequencing projects.

SNP detection in Allium cepa doubled haploids
To link the eight chromosome-specific unigene sets to genetic linkage map information, we accumulated transcriptome data from the F2 mapping population derived from a cross between the A. cepa DH lines (DHA for shallot × DHC for bulb onion). As the parental lines were doubled haploid, genotypes in the mapping population should be classified as reference (DHA) homozygous, alternative (DHC) homozygous, or heterozygous. RNA sequence data were collected from 96 F2 lines (population A) of the mapping population and from DHC. The intraspecific SNPs identified by mapping DHC reads with ≥ 2 reads coverage on all 96 lines were selected for genotyping. Selecting co-dominant SNP sites with heterozygous genotypes among the 96 lines identified 16,872 SNP sites in 5,339 unigenes. One SNP site was identified on 2,109 unigenes. These genotypes were used for map calculation with an O indication, meaning that the genotype was supported by one SNP site. Of the 3,230 unigenes with multiple SNP sites, ≥ 2 SNP sites with identical genotype patterns on the 96 lines were identified on 1,435 unigenes. These patterns were selected as the solid genotype (S) of the corresponding unigenes. For the remaining 1,795 unigenes, inconsistencies between the homozygous and heterozygous calls were identified among the 96 lines. The representative genotype (R) was created by selecting the most abundant genotype in each of the 96 lines.
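The O, S, and R categories described above can be illustrated with a short sketch. It assumes a hypothetical encoding in which each SNP site is a string with one character per line ("a" for DHA homozygous, "b" for DHC homozygous, "h" for heterozygous); four lines are used instead of 96 for brevity, and ties are broken by first occurrence, which is an arbitrary choice made here.

```python
from collections import Counter


def unigene_genotype(site_patterns):
    """Classify a unigene as O, S, or R and return its genotype pattern.

    site_patterns: list of equal-length strings, one per SNP site,
    with one character per mapping line ("a", "b", or "h").
    """
    if len(site_patterns) == 1:
        return "O", site_patterns[0]          # supported by one SNP site
    pattern, count = Counter(site_patterns).most_common(1)[0]
    if count >= 2:
        return "S", pattern                   # >= 2 sites share the pattern
    # Otherwise take the most abundant call in each line across sites.
    representative = "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*site_patterns)
    )
    return "R", representative


one_site = unigene_genotype(["abhh"])
solid = unigene_genotype(["aahb", "aahb", "ahhb"])
representative = unigene_genotype(["aahb", "ahhb", "aabb"])
print(one_site, solid, representative)
```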

Genetic linkage map construction and physical chromosome assignment
We used the solid co-dominant genotype information obtained from 1,435 unigenes in population A to plot a genetic linkage map with JoinMap v. 4.0 (Kyazma BV, Wageningen, The Netherlands). At a LOD cutoff of 5, all tested markers were assigned to eight linkage groups. Based on the unigenes with anchored chromosome information, each linkage group could be anchored to one of the eight bulb onion chromosomes. No inconsistency was detected between each linkage group and its assigned chromosome. Hence, this linkage map was reliable.
A graphical genotype list was constructed according to the unigene order information. A total of 610 genotype blocks were assigned based on the patterns of the tested 96 lines (Table S1). The remaining unigenes with O- and R-coded genotypes were allocated to the most probable genotype block, with genotype inconsistencies permitted for ≤ 5 lines. A total of 1,537 O-marked and 1,426 R-marked unigenes were allocated onto the genotype blocks (Table 2).
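The allocation step can be sketched as a nearest-pattern search. The sketch below is one possible reading, assuming that "most probable" means the block with the fewest mismatching lines; the block names, the 12-line patterns, and the function names are hypothetical.

```python
def mismatches(pattern, block_pattern):
    """Count lines whose genotype differs between a unigene and a block."""
    if len(pattern) != len(block_pattern):
        raise ValueError("patterns must cover the same lines")
    return sum(1 for p, b in zip(pattern, block_pattern) if p != b)


def allocate_to_block(pattern, blocks, max_mismatch=5):
    """Return the id of the closest genotype block, or None.

    blocks: dict mapping block id -> genotype pattern of that block.
    A unigene is allocated only if it differs from the best block
    in at most `max_mismatch` lines.
    """
    best_id, best_dist = None, None
    for block_id, block_pattern in blocks.items():
        dist = mismatches(pattern, block_pattern)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = block_id, dist
    if best_dist is not None and best_dist <= max_mismatch:
        return best_id
    return None


# Hypothetical blocks over 12 lines (the study used 96 lines and 610 blocks).
blocks = {
    "chr1_b1": "aaaahhhhbbbb",
    "chr1_b2": "aaahhhhhbbbb",
    "chr2_b1": "bbbbbbhhhaaa",
}
near = allocate_to_block("aaahhhhhbbba", blocks)
far = allocate_to_block("hhhhhhaaaaaa", blocks)
print(near, far)
```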
To confirm the accuracy of the transcriptome-based genetic linkage map, we applied conventional PCR-based markers to the same F2 population (A). The PCR-based SSR and InDel markers were previously reported (Fischer and  [9,10,14,15,16,17,19,21,31] and used in the present study. Thirty-three markers were polymorphic between DHA and DHC. Fourteen InDel polymorphisms were detected by sequence comparisons between DHA and DHC in Allium TDB. We designed primer sets that included these polymorphism sites and amplified them by PCR. We used 47 PCR-based markers in a linkage analysis on another F2 population (B). All linkage groups were assigned to the eight physical chromosomes, as confirmed by amplification in the MALs. These marker locations matched those in previous reports (Tsukazaki et al., 2008, 2011, 2015; Masuzaki et al., 2006a, 2006b; Wako, 2016) [16,17,18,31,34,35]. We selected 14 reliable PCR-based markers covering all eight chromosomes, applied them to population A, and integrated them onto the transcriptome-based genetic linkage map. The reconstructed map consisted of eight linkage groups with all SNP solid and PCR-based markers covering 936.6 cM. The average marker interval was 0.65 cM. All PCR-based markers were integrated onto positions corresponding to those on the map of population B, which was based on PCR markers alone. No contradiction in marker location was found when common markers were compared between these maps and another linkage map previously constructed with a gynogenic population (C) derived from the same F1 hybrid between DHA and DHC (Fig. 1) (Wako, 2016) [31]. We also compared the genetic maps against a published transcriptome-based SNP marker analysis (Duangjit et al., 2013) [24]. Comparison of the positions of 137 SNP markers on sequences overlapping in both analyses revealed that the anchored chromosomes and relative positions were consistent for all SNP markers (Table S2).
Therefore, our transcriptome-based genetic linkage map is reliable. We obtained abundant and reliable SNP information here. We constructed a reliable genetic map based on S-marked SNP markers. No inconsistency was found between the physical chromosome assignments and S-labeled markers in the linkage group. The genetic map comprised 1,435 SNP markers, one bulb onion SSR marker, and 13 InDel markers and covered 936.6 cM. To our knowledge, this map has the highest number of markers reported for A. cepa to date. The integrated linkage maps include markers associated with phenotypic characteristics, namely the nuclear male fertility restoration loci of cytoplasmic male sterility (Chr. 2) and bulb color (Chr. 7) (Wako, 2016) [31]. Shallot is a genetic breeding resource for bulb onion as it produces certain distinctive chemical compounds such as saponins conferring pathogen resistance (Shigyo et al., 1997; Abdelrahman et al., 2017) [33,36]. By combining these DH lines with linkage map information, progress is anticipated in Allium molecular breeding by marker-assisted selection for several agronomic bulb onion traits.
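The reported average marker interval can be checked against the marker counts and map length given above. The text does not state which denominator was used, so the sketch below shows two common conventions (map length per marker, and map length per interval within the eight linkage groups); the variable names are illustrative.

```python
snp_markers = 1435      # S-marked unigene SNP markers
ssr_markers = 1         # bulb onion SSR marker
indel_markers = 13      # InDel markers
map_length_cm = 936.6   # total map length in cM
linkage_groups = 8

total_markers = snp_markers + ssr_markers + indel_markers

# Convention 1: map length divided by the number of markers.
per_marker = map_length_cm / total_markers
# Convention 2: map length divided by the number of intervals,
# i.e. markers minus one per linkage group.
per_interval = map_length_cm / (total_markers - linkage_groups)

print(total_markers, round(per_marker, 2), round(per_interval, 2))
```

Both conventions round to the same two-decimal value as the interval reported in the text.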

Conclusions
In the present study, we constructed an ultrahigh-density linkage map in Allium cepa using numerous SNP markers obtained from the transcriptome information of the Allium DH lines and the MALs. As DH techniques circumvent the need for repeated inbreeding, they are useful for making homozygous pure lines that resemble inbred lines (Abdelrahman et al., 2015) [30]. Though bulb onion and shallot have different characteristics, they both belong to A. cepa and are easy to cross (Astley et al., 1982) [4]. The MALs have all A. fistulosum chromosomes and one A. cepa chromosome (Shigyo et al., 1996) [32]. We performed a transcriptome analysis to identify unigenes and assign them to physical chromosomes. To this end, we compared shallot DH and MAL transcriptome data. We then used the F2 mapping population between bulb onion DH and shallot DH to detect SNP sites. A total of 16,872 SNP sites were identified on 5,339 unigenes. Of these unigenes, 1,435 had SNP patterns that were selected as the solid genotype of the corresponding unigene. When a linkage map was constructed with the SNP solid markers, all markers were mapped and the physical chromosomes and linkage groups were consistent. The number of SNPs located on the linkage map was much higher than those previously reported. Thus, the linkage map resolution was high. Furthermore, linkage maps integrated with PCR-based markers are now available. Shallots produce chemical compounds conferring resistance to certain bulb onion diseases (Abdelrahman et al., 2017) [33].
Hence, connecting phenotype and genotype information is a holistic approach towards Allium gene expression analysis for plant breeding and an effective, low-cost method of developing novel disease-resistant Allium varieties.

Competing interests
The authors declare that they have no competing interests.

Funding
This work was supported by JSPS KAKENHI Grant Number JP26292020.

Author contributions
All authors have read and approved the submitted version of the manuscript, and have agreed both to be personally accountable for the author's own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature. Authors' contributions are, 1.
Figure 1