Unigene chromosome anchoring by SNP genotyping via MAL RNA sequencing
To create unigene markers, we collected MAL transcriptome data via frequent RNA sequencing (RNA-Seq). We identified candidate genes related to the physiological traits of each line (Abdelrahman et al., 2017). The transcriptome sequence reads obtained for each MAL were mapped onto the doubled haploid shallot (DHA) unigene sequences. Unigene transcript levels were evaluated as RPKM (reads per kilobase of exon per million mapped reads).
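The RPKM normalization mentioned above can be illustrated with a minimal sketch. The function name and the example numbers are hypothetical and are not taken from the study; only the standard RPKM definition is assumed.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads.

    RPKM = reads * 1e9 / (length in bp * total mapped reads),
    i.e. reads divided by (length in kb) and by (mapped reads in millions).
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000-bp unigene
# in a library with 20 million mapped reads.
print(rpkm(500, 2000, 20_000_000))  # 12.5
```

Dividing by both transcript length and library size makes values comparable between unigenes of different lengths and between MAL libraries of different depths.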
We performed SNP discovery and genotyping as advanced applications of the mapping data. By comparing the genotype calls in the transcriptome mapping data between the MAL parental lines (bunching onion and shallot), we selected SNP sites with alternative homozygous calls in bunching onion and reference homozygous calls in shallot. Among the 56,161 DHA unigenes, such sites with ≥ 4 reads of coverage in all eight MALs were identified on 25,462 unigenes (Table 1). Of these, 3,278 unigenes carried a single SNP site. Their chromosome assignments could be completed because a heterozygous genotype was identified in only one MAL in each case. In contrast, multiple SNP sites were identified in 22,184 unigenes. Of these, 21,996 could be allocated to a single physical chromosome. For these unigenes, the extra chromosome of the MAL showing heterozygous genotypes was consistent among the multiple SNPs. For the remaining 188 unigenes, the calls at ≥ 2 SNP sites were ambiguous. These showed heterozygous genotypes across the eight MAL types and/or parental homozygous genotype(s). The corresponding gene may have been downregulated, and the shallot gene may have had only partial homology. These unigenes were assigned to a chromosome based on the other marker(s) showing a heterozygous genotype in a single MAL; they were given the R indication and mapped by representative SNPs. In total, 25,462 unigenes were anchored on the eight chromosomes. Chromosome 2 carried 4,513 unigenes, whereas chromosome 8 carried only 2,169.
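The anchoring logic described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the genotype codes ('het', 'alt', 'ref'), the function names, and the majority-vote rule for ambiguous unigenes are all hypothetical. Each SNP site is represented as a mapping from the extra chromosome (1 to 8) carried by a MAL to the genotype call in that MAL.

```python
from collections import Counter


def chromosome_votes(snp_calls):
    """Count, per chromosome, the SNP sites that are heterozygous in exactly one MAL."""
    votes = Counter()
    for calls in snp_calls:
        het_mals = [chrom for chrom, genotype in calls.items() if genotype == "het"]
        if len(het_mals) == 1:  # an informative site points to a single MAL
            votes[het_mals[0]] += 1
    return votes


def assign_chromosome(snp_calls):
    """Return (chromosome, label) for one unigene.

    label is 'consistent' when every site is heterozygous in the same single MAL,
    'representative' when a chromosome wins by majority among informative sites
    (an assumed rule), and 'ambiguous' or 'unassigned' otherwise.
    """
    votes = chromosome_votes(snp_calls)
    if not votes:
        return None, "unassigned"
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, "ambiguous"  # tie between chromosomes
    chrom, n_sites = ranked[0]
    if len(ranked) == 1 and n_sites == len(snp_calls):
        return chrom, "consistent"
    return chrom, "representative"


def site(het_chrom):
    """Hypothetical site: heterozygous in one MAL, bunching onion homozygous elsewhere."""
    return {c: ("het" if c == het_chrom else "alt") for c in range(1, 9)}


print(assign_chromosome([site(2), site(2)]))                   # (2, 'consistent')
print(assign_chromosome([site(2), {**site(2), 5: "het"}]))     # (2, 'representative')
```

In the second call, one site is heterozygous in two MALs and is therefore treated as uninformative, so the unigene is placed by its remaining site.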
DHA unigene information has been made public through the web database ‘Allium Transcriptome DataBase’ (TDB) at http://alliumtdb.kazusa.or.jp. We integrated the chromosome marker information into the page of each corresponding unigene. These anchoring markers should be useful in genome sequencing projects.
SNP detection in Allium cepa doubled haploids
To link the eight chromosome-specific unigene sets to genetic linkage map information, we accumulated transcriptome data from the F2 mapping population derived from a cross between the A. cepa DH lines (DHA for shallot × DHC for bulb onion). As the parental lines were doubled haploid, genotypes in the mapping population should be classifiable as reference (DHA) homozygous, alternative (DHC) homozygous, or heterozygous. RNA sequence data were collected from 96 F2 lines (population A) of the mapping population and from DHC. Intraspecific SNPs identified by mapping the DHC reads, and covered by ≥ 2 reads in all 96 lines, were selected for genotyping. Selecting co-dominant SNP sites with heterozygous genotypes among the 96 lines identified 16,872 SNP sites in 5,339 unigenes. One SNP site was identified on 2,109 unigenes. Their genotypes were used for map calculation with an O indication, meaning that the genotype was supported by one SNP site. Of the 3,230 unigenes with multiple SNP sites, ≥ 2 SNP sites with identical genotype patterns across the 96 lines were identified on 1,435 unigenes. These patterns were selected as the solid genotype (S) of the corresponding unigenes. For the remaining 1,795 unigenes, inconsistencies between the homozygous and heterozygous calls were identified among the 96 lines. For these, a representative genotype (R) was created by selecting, for each of the 96 lines, the most abundant genotype among the SNP sites.
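The O, S, and R classification described above can be sketched as follows. This is an illustration only: the single-letter genotype codes ('A' for DHA homozygous, 'B' for DHC homozygous, 'H' for heterozygous) and the function name are assumptions, and each SNP site is written as a string with one character per F2 line (four lines here instead of 96).

```python
from collections import Counter


def classify_unigene(site_patterns):
    """Return (label, genotype pattern) for one unigene.

    'O': only one SNP site supports the genotype.
    'S': at least two SNP sites share an identical pattern across all lines.
    'R': no two sites agree; the per-line most abundant call is used.
    """
    if not site_patterns:
        raise ValueError("at least one SNP site is required")
    if len({len(p) for p in site_patterns}) != 1:
        raise ValueError("all sites must cover the same number of lines")
    if len(site_patterns) == 1:
        return "O", site_patterns[0]
    pattern, count = Counter(site_patterns).most_common(1)[0]
    if count >= 2:
        return "S", pattern
    # Majority call per line; ties fall to the call seen first (an arbitrary choice).
    representative = "".join(
        Counter(calls).most_common(1)[0][0] for calls in zip(*site_patterns)
    )
    return "R", representative


print(classify_unigene(["AHBH"]))                  # ('O', 'AHBH')
print(classify_unigene(["AHBH", "AHBH", "AHBA"]))  # ('S', 'AHBH')
print(classify_unigene(["AHBH", "AHBA", "AABH"]))  # ('R', 'AHBH')
```

Only S-coded genotypes need no reconciliation between sites, which is consistent with their use as the backbone of the linkage map.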
Genetic linkage map construction and physical chromosome assignment
We used the solid co-dominant genotype information obtained from 1,435 unigenes in population A to construct a genetic linkage map with JoinMap v. 4.0 (Kyazma BV, Wageningen, The Netherlands). With a LOD cutoff of 5, all tested markers were assigned to eight linkage groups. Based on the unigenes with anchored chromosome information, each linkage group could be anchored to one of the eight bulb onion chromosomes. No inconsistency was detected between any linkage group and its assigned chromosome. Hence, this linkage map was considered reliable.
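The consistency check between linkage groups and MAL-based chromosome assignments can be sketched as below. The function name, the marker identifiers, and the input dictionaries are hypothetical; in practice the group assignments would be exported from the mapping software.

```python
from collections import defaultdict


def check_group_chromosome_consistency(marker_to_group, marker_to_chromosome):
    """Collect the anchored chromosomes seen in each linkage group.

    Returns (chromosomes per group, conflicts), where conflicts lists every
    group whose markers were anchored to more than one chromosome.
    """
    chroms_by_group = defaultdict(set)
    for marker, group in marker_to_group.items():
        chrom = marker_to_chromosome.get(marker)
        if chrom is not None:  # skip markers without a MAL-based anchor
            chroms_by_group[group].add(chrom)
    conflicts = {g: sorted(c) for g, c in chroms_by_group.items() if len(c) > 1}
    return dict(chroms_by_group), conflicts


# Hypothetical markers: two linkage groups, each anchored to one chromosome.
groups = {"u1": "LG1", "u2": "LG1", "u3": "LG2", "u4": "LG2"}
anchors = {"u1": 2, "u2": 2, "u3": 8, "u4": 8}
per_group, conflicts = check_group_chromosome_consistency(groups, anchors)
print(per_group)   # {'LG1': {2}, 'LG2': {8}}
print(conflicts)   # {}
```

An empty conflict dictionary corresponds to the situation reported here, in which no linkage group contained markers anchored to different chromosomes.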
A graphical genotype list was constructed according to the unigene order information. A total of 610 genotype blocks were assigned based on the patterns of the 96 tested lines (Table S1). The remaining unigenes with O- and R-coded genotypes were allocated to the most probable genotype block, permitting genotype inconsistencies in ≤ 5 lines. A total of 1,537 O-marked and 1,426 R-marked unigenes were allocated onto the genotype blocks (Table 2).
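The allocation of O- and R-coded unigenes to genotype blocks can be sketched as a nearest-pattern search. This is an assumption about how "most probable" might be operationalized (fewest mismatching lines); the function names, block identifiers, and the short eight-line patterns are hypothetical.

```python
def mismatches(genotype, block_pattern):
    """Number of lines whose genotype call differs from the block pattern."""
    if len(genotype) != len(block_pattern):
        raise ValueError("genotype and block must cover the same lines")
    return sum(g != b for g, b in zip(genotype, block_pattern))


def allocate_to_block(genotype, blocks, max_mismatches=5):
    """Return (block id, mismatch count) for the closest block, or None.

    None is returned when even the closest block differs in more than
    max_mismatches lines, so the unigene stays unallocated.
    """
    best_id, best_dist = None, None
    for block_id, pattern in blocks.items():
        dist = mismatches(genotype, pattern)
        if best_dist is None or dist < best_dist:
            best_id, best_dist = block_id, dist
    if best_dist is None or best_dist > max_mismatches:
        return None
    return best_id, best_dist


# Hypothetical blocks scored on eight lines instead of 96.
blocks = {"blk1": "AAHHBBHH", "blk2": "AHHHBBAA"}
print(allocate_to_block("AAHHBBHA", blocks))                    # ('blk1', 1)
print(allocate_to_block("AAHHBBHA", blocks, max_mismatches=0))  # None
```

The threshold of five lines follows the text; when two blocks are equally close, this sketch keeps the first one encountered.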
To confirm the accuracy of the transcriptome-based genetic linkage map, we applied conventional PCR-based markers to the same F2 population (A). The PCR-based SSR and InDel markers were previously reported (Fischer and Bachmann, 2000; Kuhl et al., 2004; Martin et al., 2005; McCallum et al., 2012; Tsukazaki et al., 2006, 2007, 2008, 2011; Wako, 2016) [9,10,14,15,16,17,19,21,31] and used in the present study. Thirty-three markers were polymorphic between DHA and DHC. Fourteen InDel polymorphisms were detected by sequence comparisons between DHA and DHC in Allium TDB. We designed primer sets that included these polymorphic sites and amplified them by PCR. We used the resulting 47 PCR-based markers in a linkage analysis on another F2 population (B). All linkage groups were assigned to the eight physical chromosomes, as confirmed by amplification in the MALs. These marker locations matched those in previous reports (Tsukazaki et al., 2008, 2011, 2015; Masuzaki et al., 2006a, 2006b; Wako, 2016) [16,17,18,31,34,35]. We selected 14 reliable PCR-based markers covering all eight chromosomes, applied them to population A, and integrated them onto the transcriptome-based genetic linkage map. The reconstructed map consisted of eight linkage groups containing all solid SNP markers and the PCR-based markers, and covered 936.6 cM. The average marker interval was 0.65 cM. All PCR-based markers were integrated at positions corresponding to those on the PCR marker-based linkage map of population B. Using the common markers, no contradiction in marker location was found between these maps and another linkage map previously constructed with a gynogenic population (C) derived from the same F1 hybrid between DHA and DHC (Fig. 1) (Wako, 2016). We also compared the genetic maps against a published transcriptome-based SNP marker analysis (Duangjit et al., 2013).
Comparison of the positions of 137 SNP markers on sequences overlapping in both analyses revealed that the anchored chromosomes and relative positions were consistent for all SNP markers (Table S2). These comparisons support the reliability of our transcriptome-based genetic linkage map.
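The summary figures of the reconstructed map can be checked with a short calculation. The sketch below assumes that the average marker interval is the total map length divided by the number of markers; other conventions exist (for example, subtracting the number of linkage groups from the marker count), and the text does not state which one was used.

```python
# Values taken from the text: 1,435 solid SNP markers plus 14 PCR-based markers.
n_snp_markers = 1435
n_pcr_markers = 14
map_length_cm = 936.6

n_markers = n_snp_markers + n_pcr_markers
average_interval_cm = map_length_cm / n_markers  # assumed convention: length / markers

print(n_markers)                      # 1449
print(round(average_interval_cm, 2))  # 0.65
```

Under this assumed convention, the result agrees with the reported average interval of 0.65 cM.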
MALs have been used extensively to assign DNA markers to physical chromosomes (van Heusden et al., 2000b; Martin et al., 2005; Tsukazaki et al., 2008) [9,12,16]. Here, we identified chromosome-specific SNPs by comparing transcriptome data from the MALs. For the first time, we used F2 populations derived from Allium DH parental lines. Because each parental line is homozygous at every locus, SNPs between the parental lines DHA and DHC are easily detected. Transcriptome data from DH lines are an efficient source of SNPs (Baldwin et al., 2012), and we obtained abundant and reliable SNP information here. We constructed a reliable genetic map based on S-marked SNP markers. No inconsistency was found between the physical chromosome assignments and the S-labeled markers in each linkage group. The genetic map comprised 1,435 SNP markers, one bulb onion SSR marker, and 13 InDel markers and covered 936.6 cM. To our knowledge, this map has the highest number of markers reported to date. The integrated linkage maps include markers associated with phenotypic traits, namely the nuclear male fertility restoration loci for cytoplasmic male sterility (Chr. 2) and bulb color (Chr. 7) (Wako, 2016). Shallot is a genetic breeding resource for bulb onion as it produces certain distinctive chemical compounds, such as saponins, that confer pathogen resistance (Shigyo et al., 1997; Abdelrahman et al., 2017) [33,36]. By combining these DH lines with linkage map information, progress is anticipated in Allium molecular breeding through marker-assisted selection for several agronomic bulb onion traits.