Development of chloroplast derived SSR markers for genus Allium and their characterization in the allies for genetic improvement of alliums

Alliums are popular for their culinary uses and nutraceutical benefits. Their production is greatly affected by multiple biotic and abiotic stresses, and poor characterization of genetic resources is a major bottleneck in their genetic improvement. Chloroplast-derived simple sequence repeat (cpSSR) markers have recently gained popularity because of their maternal inheritance, low recombination, and hypervariable nature. In this study, 22 cpSSR markers were developed from the chloroplast genomes of A. cepa and A. sativum. Comparison of repeat types showed that trinucleotide repeats were the most frequent (50%). The number of alleles ranged from 2 to 4, heterozygosity from 0.009 to 0.540, and PIC from 0.007 to 0.427. A polymorphism survey and clustering of twenty-five alliums with the twenty-two cpSSR markers produced three groups (groups I, II, and III). Cultivated A. cepa and A. sativum fell into group II, apart from most wild alliums, confirming the usefulness of the AccpSSR and AscpSSR markers, which should support the introduction of desirable biotic and abiotic stress tolerance traits from various wild alliums into selected cultivated alliums. In addition, these cpSSRs were validated in 79 alliums, which were divided into three groups using Jaccard dissimilarity and Bayesian model-based structure analysis. The resulting clustering allowed us to identify diverse alliums for constructing a core collection of germplasm resources. The study will be useful for molecular breeding and genomic selection-based crop improvement.


Introduction
Onion and garlic are commercially important crops because of their culinary value around the world and their secondary metabolites, quercetin in onion and allicin in garlic (Jones et al. 2004; Jayaswall et al. 2019a, 2019b), which are valued for a variety of health benefits. Nonetheless, the adverse effects of climate change as well as biotic factors are diminishing onion and garlic yields. Through molecular breeding, numerous traits such as high total soluble content, drought and waterlogging tolerance, and resistance to various diseases and pests could be introgressed into popular cultivars to obtain superior cultivars of onion (Ghodke et al. 2020; Jayaswall et al. 2021; Singh et al. 2020; DC et al. 2021).
Various morphological and biochemical markers have been used to characterize alliums and to identify sources of resistance to biotic and abiotic stresses, but these were inadequate because they are subject to environmental influences (Scholten et al. 2016; Kim et al. 2021). Genetic markers such as RFLPs, RAPDs, ISSRs, SSRs, ILPs, and SNPs have been employed in genotyping of alliums (McCallum et al. 2008). Among these, SSRs are recommended for gene mapping, gene introgression, and diversity and variability studies because they are co-dominant, multi-allelic, hypervariable, and chromosome-specific (Avise et al. 1995; Doebley 1992). Nuclear genome-derived markers, however, are not conserved and cannot identify maternally inherited relationships across cultivars, which is critical for understanding lineage and for selecting multiple sources of resistance to different stresses. Furthermore, owing to their high degree of sequence divergence, which prevents comparisons between sequences and allele sizes, nuclear genome-based markers were less effective for identifying evolutionary relationships among alliums (Sharma et al. 2020a; Sharma et al. 2020b). Chloroplast SSR markers (cpSSRs), obtained from the chloroplast genome, have a number of advantages over nuclear genome-derived molecular markers because they show maternal inheritance. SSR loci in the nuclear genome are predominantly distributed throughout the noncoding sections of the genome and carry more sequence variants than the coding areas of the chloroplast genome, which evolve slowly and have a negligible recombination rate (McCauley 1995; Provan et al. 1998).
The conserved gene order in chloroplasts, the widespread availability of cpSSR primers, and the lack of recombination have made chloroplast genomic resources a useful tool for plant evolutionary investigations (Olmstead and Palmer 1994). The cpSSR markers of one species are cross-transferable to other species of the same family (Sharma et al. 2020b). As a result, cpSSR markers are widely used to identify and characterize genetic links across closely related species and to identify resistance sources. Because chloroplast genomes are conserved, cpSSR markers can also be used to identify historical bottlenecks and genetic drift effects in alliums (McCauley 1995). It has long been acknowledged that the evolutionary processes associated with domestication of diverse wild plant relatives have resulted in a decrease of genetic variation in the genomes (Tanksley and McCouch 1997). The loss of chloroplast genomic diversity in several cultivated crops has been matched by a loss of nuclear diversity, as demonstrated by a variety of nuclear SSR markers (Russell et al. 2000). However, unlike conventional nuclear-based markers, the conserved nature of cpSSR markers allows for a better understanding of evolutionary relationships among cultivated and wild alliums.
The chloroplast genomes of onion and garlic were recently sequenced using various sequencing technologies (Von Kohn et al. 2013; Filyushin et al. 2016), which provides an opportunity to develop cpSSR marker resources in onion and garlic. Previously, Jayaswall et al. (2021) mined 15 cpSSR markers from Allium paradoxum for genotyping 18 alliums. In this study, the chloroplast genome sequences of A. cepa (onion) and A. sativum (garlic) were therefore used to develop 22 novel cpSSR markers, to assess their polymorphism and cross-transferability in 25 diverse alliums, and to decipher the genetic diversity and population structure of 79 alliums. All 22 identified polymorphic and cross-transferable cpSSR markers could be used to expedite introgression of biotic and abiotic stress tolerance traits into elite cultivars of alliums, to develop DNA barcodes, to understand evolutionary relationships, and to identify progenitors of cultivated species.

Isolation of DNA and polymorphism investigation
For marker validation, leaf samples from 104 allium genotypes were collected from the experimental field of ICAR-DOGR, Pune, India. Genomic DNA was isolated from the collected leaf samples (Table 1a and 1b) following the method described by Murray and Thompson (1980). The quality of the isolated genomic DNA was confirmed against the lambda uncut marker (Fermentas) on a 1% agarose gel, and the DNA quantity was determined using the NanoDrop 2000 (Thermo Scientific, USA). The PCR amplification efficiency of the developed cpSSR markers was tested with 25 ng of template DNA in a 20 µL reaction mixture as previously described (Bhandawat et al. 2015). The PCR program comprised an initial denaturation at 94°C for 4 minutes, followed by 35 cycles of three steps: 94°C for 1 minute, the optimized annealing temperature of the primer (as given in Table 2) for 1 minute, and 72°C for 1 minute. This was followed by a final extension at 72°C for 5 minutes in an I-Cycler PCR (Bio-Rad). The PCR products were separated on a 3% agarose gel. Amplicon sizes were measured against a standard 1 kb Plus DNA ladder (O'GeneRuler).

Marker scoring and data analysis
The presence (1) and absence (0) of bands amplified in alliums were recorded based on the AccpSSR and AscpSSR marker profiles. Monomorphic markers produced fragments of the same size in all alliums, whereas polymorphic markers produced fragments of varying sizes. Amplicons of the cross-transferred cpSSR markers were scored in the alliums, and a UPGMA dendrogram was drawn using DARwin6 based on Jaccard's dissimilarity coefficient (Perrier and Jacquemoud-Collet 2006). PopGene32 (Yeh et al. 1999) and Gene Calc (Bikowski and Miks 2018) were used to calculate the heterozygosity and polymorphism information content (PIC) (Table 2). The STRUCTURE software v2.3.3 was used to infer the genetic structure of the 79 alliums (Pritchard et al. 2000; Falush et al. 2007).
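The band scoring and dissimilarity step can be illustrated with a minimal sketch. It is not the DARwin6 implementation; it only shows how Jaccard's dissimilarity is obtained from 1/0 band profiles. The sample names and scores are hypothetical, and returning 0 for two profiles without any bands is a convention assumed here.

```python
def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity between two equal-length 1/0 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)  # bands present in both
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)   # bands present in at least one
    if either == 0:
        return 0.0  # assumed convention: two empty profiles are treated as identical
    return 1.0 - shared / either


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities for a dict of {sample name: band profile}."""
    names = list(profiles)
    return {(p, q): jaccard_dissimilarity(profiles[p], profiles[q])
            for p in names for q in names}


# Hypothetical band scores (1 = band present, 0 = band absent)
profiles = {
    "sample_A": [1, 0, 1, 1, 0],
    "sample_B": [1, 1, 1, 0, 0],
    "sample_C": [0, 0, 1, 1, 0],
}
matrix = dissimilarity_matrix(profiles)
print(matrix[("sample_A", "sample_B")])  # 0.5
```

A matrix of this kind is the usual input for distance-based clustering such as UPGMA or neighbour joining.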

Results and discussion
SSR repeats form a large component of complex genomes and play key roles in controlling gene interaction and genome packing. In the past, these repeats were regarded as "junk" DNA and were mostly used to uncover evolutionary relationships and identify changes in plant populations (Li et al. 2002). SSRs found in the chloroplast genome are conserved compared with those of the nuclear genome (Sharma et al. 2020b). These cpSSRs are thought to be important regulators of genome evolution and to contribute to the structural and functional variability of the nuclear genome. Many chloroplast genes regulate the transcription of various nuclear genes and play an important role in withstanding various biotic and abiotic stresses (Jayaswall et al. 2016; Wang et al. 2016). The widely consumed vegetables Allium cepa and Allium sativum, which are grown all over the world, have seen an increase in production and productivity, but various biotic and abiotic stresses are reducing the yield of cultivated alliums (Choi et al. 2020). Therefore, introgression of desirable genes, traits, and QTLs that counter the adverse effects of various biotic and abiotic stresses is the need of the hour. Despite this, no cpSSR marker resources of Allium cepa and Allium sativum have been reported so far, apart from our previous study in which we mined 15 cpSSR markers from Allium paradoxum (Jayaswall et al. 2021). In alliums, cpSSR markers may be used for genetic identification, characterization, and introgression of desirable traits through linked molecular markers (Parida et al. 2010). Therefore, finding functionally appropriate and validated polymorphic, cross-transferable cpSSR markers will help in evolutionary studies and trait introgression (Sharma et al. 2020b).

Mining of chloroplast genome and development of cpSSR markers
Mining markers and genomic sequences of a crop from public sequence databases is a fast and cost-effective approach. Chloroplasts play an important role in the regulation of nuclear genes and in other biological processes, including evolutionary processes and pathways. Based on specified criteria, a total of twenty-two cpSSRs were mined and tested for amplification success in 25 alliums. Previously, 15 cpSSRs were identified in a wild allium species (Jayaswall et al. 2021), but the current study represents the first report of the development of chloroplast SSR markers in cultivated alliums, i.e., Allium cepa and Allium sativum. Among the types of SSR repeats found in the chloroplast genomes of Allium cepa and Allium sativum, trinucleotide repeats were the most frequent (50%). This prevalence of trinucleotide repeats could be related to selection against frameshift mutations in coding genes, as is seen in the protein-coding regions of the genome (Varshney et al. 2005; Metzgar et al. 2000).
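As an illustration of what SSR mining involves, the following sketch scans a DNA sequence for perfect tandem repeats with regular expressions. It is not the tool or the criteria used in this study, which are not specified here; the minimum repeat counts in MIN_REPEATS and the function name find_ssrs are assumptions made only for the example.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in a sequence."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of the given length followed by at least (min_rep - 1) more copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATTATT"
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


# Hypothetical sequence containing one trinucleotide and one dinucleotide repeat
example = "GGC" + "ATT" * 6 + "CCGTA" + "AG" * 7 + "TTC"
print(find_ssrs(example))  # [(3, 'ATT', 6), (26, 'AG', 7)]
```

Start positions are 0-based. In practice, primers would then be designed in the regions flanking each reported repeat.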

Scoring and data analysis
All 22 cpSSRs developed from Allium cepa and Allium sativum amplified in at least one of the 25 allium individuals. The observed amplicon size of a few cpSSRs was larger than expected, which could be due to the presence of an intron within the amplified region, while a few cpSSRs failed to amplify in some individuals, which might be due to disruption of the primer annealing site, as observed previously (Bhandawat et al. 2015; Bhandawat et al. 2019). In contrast to nuclear genome-derived markers (Sharma et al. 2020b), the cross-transferability of cpSSR primers was greater in the current investigation because of the conserved nature of sequences in the chloroplast genome. Our previous study, in which we mined 15 Allium paradoxum cpSSR markers (Jayaswall et al. 2021), reported 100% cross-transferability, similar to the present study. Jayaswall et al. (2022) reported 86.2% cross-transferability for AcPIP markers in onion. Allele number ranged from 2 to 4, which is lower than in our previous study of Allium paradoxum cpSSRs (Jayaswall et al. 2021) but similar to Ziziphus cpSSR markers (Huang et al. 2015). Compared with cpSSRs, higher numbers of alleles were reported using genomic and EST-SSRs, which could be due to higher sequence conservation in the chloroplast genome (Ricciardi et al. 2020; Baldwin et al. 2012). In our previous work on Allium paradoxum cpSSRs, PIC ranged from 0.09 to 0.68 and heterozygosity from 0.104 to 0.733 (Jayaswall et al. 2021), whereas the present study found comparatively lower PIC (0.007 to 0.427) and heterozygosity (0.009 to 0.540) (Table 2), which might be due to a reduction in genetic diversity during domestication and selection of cultivated alliums. Nevertheless, among the 22 cpSSRs, 11 markers showed PIC in the range 0.408 to 0.534, which indicates higher polymorphism and suitability for genotypic study.
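For readers unfamiliar with these statistics, the sketch below computes expected heterozygosity (gene diversity) and PIC from allele frequencies using the commonly cited formulas H = 1 - Σp_i² and PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j². This is an illustration under those assumptions and may differ in detail from what PopGene32 and Gene Calc implement; the allele calls are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(calls):
    """Allele frequencies (in sorted allele order) from a list of allele calls."""
    counts = Counter(calls)
    total = len(calls)
    return [counts[allele] / total for allele in sorted(counts)]


def expected_heterozygosity(freqs):
    """Gene diversity: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return 1.0 - homozygous - cross


# Hypothetical marker with two equally frequent alleles
freqs = allele_frequencies(["a", "a", "b", "b"])
print(expected_heterozygosity(freqs))  # 0.5
print(pic(freqs))                      # 0.375
```

Both statistics approach zero when one allele dominates, which is why markers with few, unevenly distributed alleles give low values.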

Taxon analysis and understanding of genetic relationship among wild and cultivated alliums
Plant molecular breeding requires the identification of genetic variability in crops, and an understanding of population structure aids in the selection of elite, diversified germplasm (Chakraborty et al. 2016). The goal of the population structure study was to assess the heterogeneity or variation in the target allium population for mapping genes and dissecting significant traits (Wei et al. 2006). Twenty-five alliums (17 wild alliums and 8 cultivated Allium cepa and Allium sativum) were chosen for cross-transferability and polymorphism investigations using the 22 cpSSR markers. Wild alliums were found to be exceedingly variable in shape, anatomy, and genetic organization (Kik 2002; Nanda et al. 2016). Distance-based clustering using the twenty-two cpSSR markers partitioned the twenty-five alliums into two groups with one outgroup. Allium tuberosum Rottler ex Spreng., A. hookeri Thw., A. fistulosum L., A. ladebouramun, A. senescens L., A. ampeloprasum L., A. altaicum Pall., A. hookeri Thw., A. fistulosum L., A. fistulosum L., A. fistulosum L., A. fistulosum L., A. chinense G. Don, A. macranthum Baker, A. ascalonicum L., A. carolinianum DC., and A. prezwalskianum were clustered under group I with sub-grouping. Group II includes A. cepa L. var. Yellow Globe, A. cepa L. var. Bhima Super, A. cepa L. var. Bhima Shweta, A. chinense G. Don, A. sativum L. var. CITHGo1, A. sativum L. var. Bhima Purple, A. sativum L. var. Bhima Omkar, A. sativum L. var. G282, and A. cepa L. var. Bhima Shakti (Fig. 1). These groupings demonstrate that cultivated A. cepa and A. sativum are in group II, whereas most wild alliums fall under groups I and III, confirming the originality and usefulness of these cpSSR markers.
Further, we genotyped 79 alliums with these 22 polymorphic and cross-transferable cpSSR markers to validate the markers and to better understand the structure of wild alliums. The genetic structure and grouping of the 79 alliums were confirmed using Bayesian model-based structure analysis (Fig. 3) and a Jaccard dissimilarity-based NJ tree (Fig. 2). Based on the genotypic data from these 22 cpSSR markers, the 79 alliums were divided into three groups (1, 2, and 3). Group 1 includes A. macranthum Baker, A. chinense G. Don, A. prezwalskianum, A. schoenoprasum L., and A. cepa L. Group 2 includes A. tuberosum Rottler ex Spreng., A. hookeri Thw., A. altaicum Pall., A. fragrans Vent., A. fistulosum L., and A. cepa var. aggregatum G. Don. Owing to the limited number of markers and its heterogeneous nature in terms of appearance and genetic structure, A. fistulosum L. fits into both groups 1 and 2. Similarly, Allium altaicum Pall. belongs to both groups 1 and 3 (Fig. 2). The genetic classification of alliums (Fig. 2) allows us to identify specific alliums for constructing core collections (Birky 1988). Using these molecular markers, genes for resistance to various biotic and abiotic stresses might be introduced into premier cultivars of A. cepa. The characteristics of the chloroplast genome, such as lack of recombination and generally uniparental inheritance, make chloroplast SSR markers suitable for understanding plant evolution, population genetics, phylogeny, synteny, and domestication (Sharma et al. 2020a).
The findings imply that A. cepa, which falls into group II, could be used as an elite cultivar for introgression of desirable traits from donor wild alliums. Diverse wild alliums carry resistance genes for various biotic and abiotic stresses (Nanda et al. 2016). The relatedness of the allium species within and between groups I, II, and III, as drawn in the dendrogram, therefore provides an opportunity to bring in various desirable agronomic traits through marker-assisted selection (Nanda et al. 2016; Birky 1998). Characterization of these different alliums will aid in breeding, conservation, and development of climate-resilient alliums in the future.

Conclusion
Chloroplast-derived SSRs alter the activity and function of other chloroplastic genes through their variable repeat length and also influence the expression of nuclear genes, resulting in phenotypic variation in alliums. To the best of our knowledge, no cpSSR marker resources of A. cepa and A. sativum containing diverse regulatory genes have been mined to date. The mining of a collection of 22 new chloroplast-derived microsatellite markers, together with the experimental validation of all 22 polymorphic cpSSR markers, reveals their functional significance in A. cepa and A. sativum. Further, the use of these novel polymorphic, cross-transferable cpSSR markers to characterize the diversity and population structure of 79 alliums shows that they could be of potential use for molecular breeding studies in alliums on a larger scale.

Declarations
Conflict of interest statement
The authors declare no conflict of interest.
Tables
Table 1a. Description of the samples used for the characterization of chloroplast simple sequence repeat (cpSSR) markers.