Development and Application of a Duplex-Single Sequence Repeat Panel for Outcrossing Fertility Evaluation in Red Clover Under Open-Pollinating Conditions

Red clover (Trifolium pratense L.) is a globally significant legume with economic value as a forage and green manure crop in temperate agricultural zones. Based on its gametophytic self-incompatibility system and on analyses of seed set rates under self- and cross-pollination, red clover has long been considered an outcrossing species. However, the outcrossing rates of red clover under open-pollination conditions are not definitive. A reliable, timesaving, and easily usable marker system is needed to quantify and characterize rates of red clover crossing and selfing. Here, genome-wide screening of 209 mapped simple sequence repeat (SSR) markers was conducted, and 185 produced clear scorable bands on a pooled DNA sample of 20 red clover accessions. Seventy markers were selected based on polymerase chain reaction (PCR) amplification quality on 24 genotypes and their relatively even distribution in the red clover genome. Mean polymorphic information content for the 70 markers was 0.490, ranging from 0.117 to 0.878. From these markers, a core set of 24 loci, which had been mapped on six linkage groups, was further tested to develop 12 sets of duplex markers. Using the established duplex PCR protocol, 10 out of 12 sets of duplex SSR markers were used to genotype 60 maternal parents and their respective 22 half-sib progenies. Eight plants among the 1,320 progenies were identified as selfed, indicating that the outcrossing rate was 99.4% in a natural environment. The protocol and the finding of this rare selfing strategy under open-pollinating conditions contribute to red clover breeding efforts.


Introduction
Red clover (Trifolium pratense L.), a predominantly diploid (2n = 2x = 14) perennial species, is an important forage legume harvested for animal fodder, grown in pasture for grazing, and used as a green manure crop in temperate agricultural zones worldwide (Annicchiarico et al. 2015; Taylor and Quesenberry 1996). It has a high protein content, owing to its capability to fix atmospheric nitrogen, and a reduced need for nitrogen fertilizer input, thereby reducing the environmental footprint of grassland-based agriculture (Taylor and Quesenberry 1996). Red clover has been widely distributed in northeastern and southern China as a major forage legume in cultivated and natural grasslands because of its relatively high forage yield and nutritive value. Due to its insect-mediated pollination and its effective gametophytic self-incompatibility system, red clover has long been considered an allogamous species (Taylor 1982). Accordingly, individual plants produce very few or no seeds when self-fertilized. A low percentage of self-seed set following artificial manipulation of flowering red clover has been reported (Bugge 1969; Orlova 1979; Williams and Silow 1933). Williams and Silow (1933) reported that only 22 of 394 self-pollinated plants produced seed and that only 55 seeds were produced from the 22 self-compatible plants; the average self-fertility of these plants was only 0.104%. In a study by Orlova (1979), the self-pollination seed set within and between inflorescences of a single plant was 3.7%-4.7% and 1.7%-2.1%, respectively, and no seed was set in the second inbred generation. Knowledge of the seed set rate under self- and open-pollination is useful for predicting mating system parameters. However, it cannot help determine whether plants derived from open-pollinated seeds originated from selfing or outcrossing (Tan et al. 2014). In addition, quantitative estimates of outcrossing rates under open-pollination conditions are rare.
Phenotypic markers, such as flower color, have been used to estimate selfing and outcrossing rates in previous experiments (Cruzan 1998; Kehr 1973; Pedersen 1967). Although morphological phenotypic markers are simple and easily usable, it is difficult to obtain more than one marker per plant (Jarne and David 2008), and they depend not only on the genotype but may also be environmentally sensitive.
Instead, simple sequence repeats (SSRs) or microsatellites have been recognized as ideal marker tools for genotypic analysis because of their high polymorphism, co-dominance, and environmental independence (Liu and Wu 2012 (Herrmann et al. 2006). SSR markers have been proven feasible in revealing genetic differences between offspring and parents, independent of the influence of environmental factors (Li et al. 2018; Tan et al. 2014). The available SSR linkage map allows the selection of molecular markers that cover much of the genome of red clover and makes it possible to develop a duplex PCR protocol, which would reduce the time and cost of lab work by half. Multiplex PCR combines two or more primer sets within a single PCR mixture so that separate regions of DNA are amplified simultaneously. As a widespread technique, it has been well established in many crops to test seed purity, identify genotypes, and protect intellectual property.
However, the application of multiplex PCR in forage crops is very limited.
Development of a technically reliable and easily usable multiplex marker system would be beneficial to the quantitative estimation of selfing and crossing rates. To date, no quantitative information has been published on the outcrossing rate under natural pollinating conditions in red clover. Thus, the aims of this study were as follows: (1) selection of a set of polymorphic SSR markers based on genome-wide screening and development of a duplex PCR-based protocol, as well as (2) application of this SSR system to determine the outcrossing rate of red clover under open-pollination environments.

Plant materials
For this study, 20 accessions of red clover, having a relatively high over-winter rate in central Inner Mongolia, were chosen from the National Medium-term Genebank of Forage Germplasm of China (Table S1). For initial SSR marker screening, equal concentrations of DNA from the 20 accessions were combined to form a pooled DNA sample. For each accession, the DNA sample was a mixture of three plants.
In June 2018, field plots were established in a randomized complete block design, having three replications, at the SharaQin Key Wild Scientific Monitoring Station in Hohhot. Plots were established by planting four individuals from each accession in a 4 m² area. Three individual plants from eight accessions (Table S1) were randomly selected and constituted a panel (24 plants in total) for marker polymorphism analysis.
In August 2019, open-pollinated seeds were harvested by hand from a randomly selected plant in each plot, resulting in a half-sib progeny from 60 maternal plants (encoded as R1 to R60) that were sampled from the field trial. The seeds of the corresponding samples were germinated at 20±1 °C in a petri dish, after two weeks of pre-chilling treatment. The seedlings obtained from each seed sample were transplanted into corresponding containers and grown in the greenhouse for leaf collection. Twenty-two half-sib progeny individuals [encoded as R(n)-01 to R(n)-22] were randomly selected from each maternal plant to examine outcrossing rates under natural open-pollinating field conditions. Leaf tissue samples from the 60 maternal plants were hand collected individually from field plots and then kept in a freezer at -20 °C for subsequent DNA isolation.

DNA extraction and SSR primer screening
Genomic DNA was extracted individually, from healthy leaf tissues, using a plant genome extraction kit (Tiangen, Beijing, China). The quantity and quality of DNA were checked by Nano ND-1000 spectrophotometry and 1% agarose gel electrophoresis. Each DNA sample was diluted to 20 ng/µL for use as the working template for PCR amplification.
Sixty DNA testing panels were formed to perform SSR genotyping. Each panel consisted of 24 samples, including two replicates of one parent and its 22 progeny samples. A total of 209 SSR primer pairs, which were distributed evenly in the red clover genome (Table S2), were randomly selected from the study of

Duplex PCR amplification
A smaller set of SSRs was selected for duplex PCR testing based on allele size range estimates, high polymorphism estimates, and the primer compatibility of each candidate SSR marker. The criteria used to combine SSR markers for duplex PCR construction were as follows: (1) non-overlapping allele size between markers; (2) genotype performance, when amplified via duplex PCR; (3) high compatibility and polymorphism.
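Criterion (1) can be checked mechanically once allele size ranges are known. The following is a minimal sketch only, not the authors' procedure: the marker names and size ranges are hypothetical, and the function names `size_gap` and `candidate_duplexes` are introduced here for illustration. The 34 bp value used in the usage note is the smallest separation reported in the Discussion.

```python
from itertools import combinations

# Hypothetical allele size ranges in bp (illustrative values, not from the study).
size_ranges = {
    "marker_A": (100, 140),
    "marker_B": (180, 230),
    "marker_C": (130, 170),
}


def size_gap(range_a, range_b):
    """Gap in bp between two allele size ranges; zero or negative means they touch or overlap."""
    low, high = sorted([range_a, range_b])  # order by the lower bound
    return high[0] - low[1]


def candidate_duplexes(ranges, min_gap=1):
    """Return (marker, marker, gap) for every pair separated by at least min_gap bp."""
    pairs = []
    for (name_a, ra), (name_b, rb) in combinations(ranges.items(), 2):
        gap = size_gap(ra, rb)
        if gap >= min_gap:
            pairs.append((name_a, name_b, gap))
    return pairs
```

Calling `candidate_duplexes(size_ranges, min_gap=34)` keeps only pairs whose ranges are at least 34 bp apart. Criteria (2) and (3) still have to be assessed at the bench, since size separation alone says nothing about primer compatibility.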
The PCR procedure was carried out as described above. The PCR reactions were carried out in a 10-µL volume containing 1 µL of template DNA (20 ng/µL), 0.5 µL of each 10 µM forward and reverse primer, 5 µL of PCR Supermix (Transgen Biotech, Beijing, China), and 2.0 µL of ddH2O.
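As a quick consistency check, the listed volumes add up to 10 µL only if the reaction contains four primers, which is what a duplex (two primer pairs) implies. The sketch below makes that assumption explicit; the variable names are illustrative.

```python
# Component volumes (µL) for one duplex reaction, as listed in the text.
template_dna = 1.0   # 20 ng/µL working stock
primer_each = 0.5    # per primer, 10 µM stock
n_primers = 4        # assumption: 2 markers x (forward + reverse)
supermix = 5.0
water = 2.0

total_volume = template_dna + primer_each * n_primers + supermix + water

# Derived quantities per reaction.
primer_final_um = 10.0 * primer_each / total_volume  # final concentration of each primer (µM)
template_ng = 20.0 * template_dna                    # template DNA mass (ng)

print(total_volume, primer_final_um, template_ng)
```

Under this assumption each primer ends up at 0.5 µM and each reaction receives 20 ng of template.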

Data analysis
The allele frequency, observed heterozygosity (Ho), expected heterozygosity (He), and polymorphic information content (PIC) were calculated as previously described (Liu and Wu 2012). Student's t-test was carried out using Microsoft Excel 2013. Outcrossing behavior analysis was performed by comparing SSR genotypes of the open-pollinated progeny with their respective maternal parents, following the progeny array approach (Tan et al. 2014). Apart from maternal allele(s), progeny showing one foreign band in at least two SSR loci were considered to have originated via outcrossing. If the alleles of a progeny were all derived from its maternal parent, it was considered to have originated via selfing. Microsoft Excel was used to record data and calculate the outcrossing rates.
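The decision rule above can be written as a short function. This is a sketch under stated assumptions rather than the authors' workflow (the study recorded data in Microsoft Excel): genotypes are represented as dictionaries mapping a locus name to a tuple of allele sizes in bp, all data are hypothetical, and the label "unresolved" for progeny with a foreign band at exactly one locus is introduced here because the rule as stated assigns them to neither class.

```python
def count_foreign_loci(maternal, progeny):
    """Number of loci at which the progeny carries an allele absent from its maternal parent."""
    return sum(
        1
        for locus, alleles in progeny.items()
        if set(alleles) - set(maternal[locus])
    )


def classify(maternal, progeny, min_foreign_loci=2):
    """Apply the stated rule: foreign bands at >= 2 loci means outcrossed, none means selfed."""
    n_foreign = count_foreign_loci(maternal, progeny)
    if n_foreign >= min_foreign_loci:
        return "outcrossed"
    if n_foreign == 0:
        return "selfed"
    return "unresolved"  # assumption: needs genotyping at additional loci


# Hypothetical genotypes (allele sizes in bp).
maternal = {"L1": (150, 156), "L2": (201, 201), "L3": (98, 104)}
progeny_a = {"L1": (150, 162), "L2": (201, 207), "L3": (98, 98)}
progeny_b = {"L1": (150, 150), "L2": (201, 201), "L3": (98, 104)}
progeny_c = {"L1": (156, 156), "L2": (201, 213), "L3": (104, 104)}

for name, prog in [("a", progeny_a), ("b", progeny_b), ("c", progeny_c)]:
    print(name, classify(maternal, prog))
```

Requiring two loci rather than one makes the call robust to a single scoring error, at the cost of needing more markers to resolve every progeny.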

Screening and evaluation of SSR markers
Of the 209 SSR markers, 185 (88.5%) produced clear bands with appropriate sizes, as reported previously (Sato et al. 2005). The remaining 24 (11.5%) SSR markers produced either no amplicon products or products that were not within the estimated sizes. The allele numbers ranged from 1 to 8 among the 185 SSR markers. The average allele number per locus of dinucleotide, trinucleotide, and tetranucleotide SSRs was 3.5, 3.0, and 2.9, respectively (Table 1). There were no significant differences in allele numbers between SSR markers of different repeat classes (t-test, p > 0.05).
After the initial SSR marker screening, 70 SSR markers were selected due to their high genotyping quality and relatively high allele numbers per locus. The 70 markers were distributed on seven linkage groups (LGs), and the number of SSR markers in each LG ranged from 6 on LG 1 to 16 on LG 4 (Table S3). Next, 24 individuals, chosen from eight different accessions, were selected to assess genotyping quality and identify candidate SSR markers for multiple duplex PCR. The number of alleles (NA), heterozygosity, and PIC value of the 70 SSR markers are presented in Table S3. The mean number of alleles was 3.6, and the mean PIC value was 0.49, ranging from 0.117 (RCS5421) to 0.878 (RCS6128), for all 70 loci. No significant differences in PIC values were observed among the seven LGs (t-test, p > 0.05).
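For readers unfamiliar with these statistics, the sketch below computes Ho, He, and PIC for one locus from diploid genotypes. It assumes the commonly used definitions, He = 1 - Σp_i² and PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j²; the text only states that the values were calculated as in Liu and Wu (2012), so the exact formulas used there are an assumption here. The genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes given as (allele, allele) tuples."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Ho: proportion of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (x * x) * (y * y) for x, y in combinations(p, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical genotypes at one locus (allele sizes in bp).
genotypes = [(150, 156), (150, 150), (156, 156), (150, 156)]
freqs = allele_frequencies(genotypes)
print(observed_heterozygosity(genotypes), expected_heterozygosity(freqs), pic(freqs))
```

With two equally frequent alleles, as in this toy example, He is 0.5 and PIC is 0.375; PIC is always at most He because the cross term is non-negative.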

Development of a set of duplex PCR markers
Based on the principles described by Edwards and Gibbs (1994) and Hayden et al. (2008), 40 out of the 70 single-locus markers were selected for the development of duplex sets. Of the 40 tested markers, 24 SSR markers were assembled into 12 duplex sets, which were then tested on 12 individual DNA samples. The other 16 markers were discarded due to unsatisfactory amplification during the duplex PCR. The SSR alleles produced by duplex PCR were the same as those produced by monoplex PCRs. The 24 markers comprised 10 dinucleotide, 9 trinucleotide, and 5 tetranucleotide repeat SSRs. The mean PIC values for dinucleotide, trinucleotide, and tetranucleotide SSRs were 0.610, 0.546, and 0.648, respectively (Table 2).
The 24 markers constituting the 12 duplexes were distributed on 6 LGs, and no SSR marker locus on LG 1 contributed to the duplex PCR sets. The number of SSR markers per LG ranged from one (on LG 6) to seven (on LG 3). The mean distance between two neighboring markers was 14.5 cM, and the closest markers were on LG 5, having a mean distance of 5.8 cM. The minimum, mean, and maximum PIC values of the 24 markers were 0.226, 0.594, and 0.781, respectively (Table 2).

Validation of the duplex PCR in evaluation of outcrossing rate
Sixty maternal plants (R1 to R60) grown under field conditions, together with a total of 1320 putative half-sib progeny, were genotyped for 24 loci with twelve sets of duplex PCR (Table 2). Two plants (R32-22, R58-10) out of the 1320 tested progenies were considered contaminants because their amplifications shared no maternal bands in at least two loci. The remaining 1318 progenies shared at least one maternal band and were included in further analysis.
Each half-sib family was separately genotyped to compare outcrossing and selfing progenies (Fig. 1).
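The contaminant screen described above relies on the fact that a true half-sib must inherit one maternal allele at every locus. A minimal sketch of that check follows, assuming genotypes are stored as dictionaries from locus name to a tuple of allele sizes; the function names and data are hypothetical, and the two-locus threshold mirrors the wording of the text.

```python
def shares_maternal_allele(maternal_alleles, progeny_alleles):
    """True if the progeny carries at least one of the maternal alleles at this locus."""
    return bool(set(maternal_alleles) & set(progeny_alleles))


def is_contaminant(maternal, progeny, min_mismatch_loci=2):
    """Flag a progeny that shares no maternal allele at min_mismatch_loci or more loci."""
    mismatches = sum(
        1
        for locus in progeny
        if not shares_maternal_allele(maternal[locus], progeny[locus])
    )
    return mismatches >= min_mismatch_loci


# Hypothetical genotypes (allele sizes in bp).
maternal = {"L1": (150, 156), "L2": (201, 207), "L3": (98, 104)}
half_sib = {"L1": (150, 162), "L2": (207, 213), "L3": (104, 110)}
stray_seedling = {"L1": (162, 168), "L2": (213, 219), "L3": (98, 110)}

print(is_contaminant(maternal, half_sib), is_contaminant(maternal, stray_seedling))
```

Progeny flagged this way are excluded before any selfing or outcrossing call is made.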
The results of the analysis showed that the mean outcrossing rate was 38.6% when genotyping with one random marker. The cumulative outcrossing rates increased when more markers were applied, and more polymorphisms were detected in all half-sib families. When the number of marker loci was increased to 16, the cumulative outcrossing ratio rose to 98.6% (Fig. 2). Eight progeny plants did not show any foreign bands compared with those of their corresponding maternal plants; moreover, 10 progenies showed just one foreign band. These 18 plants were considered putative selfed progenies. They were then tested with an additional two sets of SSR duplexes, and eight of them still showed no foreign bands, further indicating their self-pollination origin (Fig. 3). Thus, the outcrossing rate of red clover grown in natural pollinating environments was calculated to be 99.4% (the two contaminants were excluded). Tang et al. 2003). In this study, based on a genome-wide selection of SSR marker loci, duplex PCRs consisting of 24 SSR markers were first developed in red clover.
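The 99.4% outcrossing rate reported in the Results follows directly from the counts given there. The short calculation below reproduces it; the variable names are illustrative.

```python
# Counts reported in the text.
total_progeny = 1320   # 60 families x 22 progeny
contaminants = 2       # excluded before analysis
selfed = 8             # no foreign band even after the two additional duplex sets

analysed = total_progeny - contaminants
outcrossed = analysed - selfed

outcrossing_rate = 100.0 * outcrossed / analysed
selfing_rate = 100.0 * selfed / analysed

print(analysed, outcrossed, round(outcrossing_rate, 1), round(selfing_rate, 1))
```

Here 1310 of the 1318 analysed progeny are outcrossed, which rounds to 99.4%, leaving a selfing rate of about 0.6%.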

Discussion
Initially, DNA samples from 20 accessions of red clover were pooled together: the analysis of pooled DNA samples minimized the genotype number used for the preliminary screening of primers and provided robust estimates of the allele size range of marker loci. A similar strategy has previously been used in studies on soybean and switchgrass (Panicum virgatum L.) (Liu and Wu 2012; Narvel et al. 2000; Wang et al. 2011). In several related studies, SSR markers with dinucleotide repeats were found to be more polymorphic (Rongwen et al. 1995; Smith et al. 1997; Yokozeki et al. 1997). In this study, SSRs with dinucleotide repeats produced more alleles than those with trinucleotide and tetranucleotide repeats, as observed during the preliminary screening (Table 1), but the difference was not significant. In the final 24 SSR markers selected for duplex PCR, dinucleotide and tetranucleotide SSRs were found to be equally polymorphic (0.610 vs. 0.648). Although the PIC value for trinucleotide SSRs (0.546) was lower than those of the other two classes, no significant differences were found among the three classes of SSRs.
For PCR multiplexes, marker selection should integrate information on allele-length range, map position, polymorphism, and genotyping quality. Based on the aforementioned criteria, in this study, 40 out of 70 SSR markers were selected for the development of duplex PCR. Primer compatibility is another concern: some SSR primers and primer combinations are recalcitrant to multiplex PCR (Tang et al. 2003). Due to primer compatibility problems, 16 out of 40 markers were discarded. The 24 selected markers covered a major portion of the red clover genome, based on the available genetic maps. Due to the increased chance of undesirable interactions between primers, the band-size separations of individual SSR markers in each duplex combination should be wide enough to unequivocally score amplified alleles (Liu and Wu 2012). In this study, the minimum difference in allele size range between markers in the same set was 34 bp (set 6), which effectively avoided overlap between markers caused by primer-primer interactions in each duplex set.
The mating system influences the structure of genetic variability and the evolutionary dynamics of populations (Barrett 2003; Charlesworth and Wright 2001). As an important parameter of mating systems, estimation of the outcrossing rate is a basic step in plant population studies (Jarne and Charlesworth 1993; Schemske and Lande 1985). In the progeny array approach, Jarne and David (2008) recommended 5-10 progenies per family, for 20 families, and 5-6 loci per molecular marker. To quantify selfing and outcrossing fertility in common bermudagrass (Cynodon dactylon (L.) Pers. var. dactylon) grown under open-pollinating conditions, 11 SSR markers were used to genotype 52-60 progeny in each of the 25 families. In the study of switchgrass, four sets of randomly selected duplexes containing eight markers could discriminate the breeding origin of each progeny (Liu and Wu 2012). In this study, 10 sets of SSR markers were enough to accurately evaluate the outcrossing fertility of red clover under open-pollinating conditions. If progeny showing a foreign band in one locus were considered to have originated via outcrossing, 8 sets of SSR markers would have been enough (our study required the progeny to show one foreign band in at least two SSR loci to be considered to have originated via outcrossing). Thus, the 12 sets of duplex SSRs should provide a reservoir for breeding origin analysis in red clover.
Genetic variation in plant populations is important for individual species to adapt to environmental challenges. One way by which genetic variation is generated is outcrossing (McDonald et al. 2005). As a perennial species, red clover is widely distributed in temperate regions of the world, and adaptation to a broad range of environments made it one of the prior domestication species introduced for cultivation, which could be attributed at least partially to its near complete outcrossing mating system. Moreover, its rare selfing strategy is beneficial when appropriate pollinators are absent (Lloyd 1979; Piper et al. 1984) and provides an advantage in transmission from female parent to offspring (Jain 1976). The presence of self-compatibility within an open-pollination environment indicates that the rare selfing strategy is retained in red clover even under conditions that fully favor outcrossing. Although selfing occurs at a low frequency, it could assist in the development of inbred lines and in setting up breeding programs for red clover.
In conclusion, based on genome-wide screening, we developed a duplex PCR system including 24 polymorphic SSR markers. The application of this test system demonstrated its high discrimination capability and effectiveness. The outcrossing rate of red clover in a natural environment was calculated for the first time. These results are valuable for future studies involving breeding research on red clover.

Declarations
Funding: This work was supported by grants from the National Natural Science Foundation of China (31901385) and the Central Non-profit Research Institutes Fundamental Research Funds of China (1610332020020).
Conflicts of interest: The authors have no relevant financial or non-financial interests to disclose.
Code availability: Not applicable.
Authors' contributions: All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Jun Li, Lei Liu and Xiaojing Qiang. Zhiyong Li supervised the entire study. Fan Huang and Zinian Wu performed the experiments. The first draft of the manuscript was written by Fan Huang and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.