BarPlex v1.0: a multiplex PCR-based enrichment of genome-wide short segments that enable genetic studies in barley

Over the past 15 years sequencing methodologies have advanced greatly enabling high-throughput sequencing based genotyping of crop plants. In this study, we developed BarPlex v1.0, a robust and cost-ecient genotyping approach in barley (Hordeum vulgare L.). In this multiplex PCR-based amplication of ve-hundred genome-wide segments, followed by high-throughput sequencing of barcoded PCR products, we obtained hundreds to thousands of polymorphic markers. Comparison of genotyping with BarPlex v1.0 to genotyping-by-sequencing (GBS) revealed a similar genetic diversity. The polymorphic markers revealed by BarPlex v1.0 were highly accurate, with an average sequencing depth >700x and a data missing rate <0.5%. By analyzing 1,068 genotypes of wild barley (brittle rachis; Hordeum vulgare ssp. spontaneum L.), Tibetan semi-wild barley (brittle rachis), landraces, cultivars, as well as an F 2 population, this assay has been robust in studies of population diversity, variety pedigree, heterozygosity discrimination, linkage mapping, as well as genome-wide association study (GWAS). Notably, a diversity analysis in a population of Tibetan semi-wild barley suggested a close relationship with Chinese landraces, but a dramatic decrease in its genetic diversity, inferring that Tibet was not a center of domestication for the native wild barley.


Introduction
Cultivated barley (Hordeum vulgare ssp. vulgare L.) is among the four most important cereal crops for both animal and human consumption (including brewing). In 2019, the global yield production reached 3.1 ton per hectare, and was 2.3-fold higher than that of 1961 (FAO dataset, 2021). Factors contributing to this increase include a signi cant improvement in irrigation supply, fertilizer, and fungicide. Another important factor is that the use of elite varieties with improved yield production and resistance to biotic/abiotic stresses, relying on exploration and optimal use of agriculturally-important genes (e.g., barley Green Evolution genes sdw1 and uzu/BRI1, and the powdery mildew resistance gene mlo) (Buschges et al. 1997;Chono et al. 2003; Kuczynska et al. 2013). Marker-assisted incorporation of resistance genes rym4 and rym5 conferred reliable protection from barley yellow mosaic virus disease, which had been severe in Europe since the 1980s ; Stein et al. 2005).
Over the past few decades different marker technologies, from hybridization-based restriction fragment length polymorphism (RFLP) markers to PCR-based simple sequence repeat (SSR) markers, have been well established in barley (Graner et al. 2010). These markers have contributed to theoretical and applied genetic studies including assessment of population diversity, genetic mapping, map-based gene isolation (Komatsuda et al. 2007; Taketa et al. 2008), and marker-assisted breeding ). The availability of genomic resources has enabled the development of high-throughput genotyping assays such as SNP arrays (BOPA-1 and BOPA-2; Illumina 9K SNP chip; 50K Illumina In nium iSelect array) ( In this study, we deployed multiplexed PCR ampli cation, followed by high-throughput sequencing, to establish a robust genotyping assay (designated BarPlex v1.0) that targets ve-hundred genome-wide short fragments with an accumulative size of 91.1 kb, producing hundreds to thousands of polymorphic SNPs. Since this assay is cost-e cient and exible to incorporate additional segments/genetic markers which associate with particular phenotypic traits, it should gain great interest in genetic studies of barley, and provide a reference for other species.

Materials And Methods
Plant materials and phenotyping 1,068 genotypes consisting of 51 wild barley (H. vulgare ssp. spontaneum L., brittle rachis), 248 Tibetan semi-wild barley (including H. agriocrithon, the six-rowed barley with brittle rachis), 345 barley landraces, 329 barley cultivars, as well as an F 2 population with 95 segregants were analyzed. Passport information of each genotype is given (Supplementary Table 1). The F 2 population was generated by a cross between the barley accessions "Bin Hai Yu Da Mai" (National Crop Genebank of China: ZDM02420, tworowed/hulled) and "Bai Qing Ke" (ZDM04577, six-rowed/naked). Two-week-old plants were moved into a vernalization chamber for 35 days growth (4 °C with day/10 h, night/14 h), followed by cultivation under normal glasshouse conditions (day 22 °C/14 h and night 18 °C/10 h) until full maturity.
The adherence of caryopsis (naked or hulled) of each F 2 plant was scored after harvest and handthreshing. Spike brittleness for H. spontaneum and Tibetan semi-wild barley was determined as described in an earlier study (Pourkheirandish et al. 2015).

DNA extraction and quanti cation
Two-leaf-old seedlings were harvested for DNA extraction following a previously described method (Shi et al. 2019). DNA samples were quali ed by agarose gel electrophoresis and quanti ed using Qubit 3.0 (Thermo Fisher, USA). BarPlex v1.0: Primer design, PCR ampli cation, library construction, high-throughput sequencing, and bioinformatics analysis Whole-genome re-sequencing (0.6x) was conducted following a previously-established work ow (Zeng et al. 2018). Short-read mapping and calling for SNPs were performed in the identi cation of polymorphic sites between two landraces WDM06177 and ZDM00809 (NCBI accession IDs: SRR15661624 and SRR15661625). Flanking sequences of the SNP sites were extracted from the barley reference genome (Morex v2) (Monat et al. 2019), and subjected to primer pickup (size of PCR product: 140 to 350 bp) using BatchPrimer 3 (You et al. 2008). Out of 587 primers, the primers producing multiple fragments (unspeci c ampli cation) in test experiments were discarded. The rst round of PCR ampli cation using the GenoPlexs multiple PCR ampli cation kit was performed in a volume of 30 μL containing 10 ng genomic template DNA, 10 μL Genoplexs Master Mix (3x, including high-delity polymorphism and PCR buffers) as well as mixed primers of an equal molar. The PCR reaction included denaturation at 95 °C for 5 min, followed by six cycles of 30 sec at 95 °C, 4 min at 60 °C, and a nal extension at 72 °C for 5 min. PCR products were puri ed by adding 15 μL GenoPrep DNA Clean Beads solution, followed by two steps of washing with 75% ethanol. The puri ed PCR products, which serve as PCR templates, were subjected for the second round of PCR ampli cation by adding 1 μL of barcode solution and 10 μL 3x Genoplexs Master Mix, with the same PCR reaction procedure as above. PCR products were puri ed as described and eluted with 30 μL Tris-HCl solution (pH = 8). PCR products were quali ed by agarose gel electrophoresis and quanti ed using Qubit 3.0 (Thermo Fisher, USA). The barcoded PCR products were combined with an equal molar ratio, and were sequenced using Illumina Hiseq X ten or DNBSEQ-T7. the VariantFiltration function with parameters of MQ0 ≥ 4 && (MQ0 / (1.0 * DP) > 0.1) and DP < 5 || QD < 2. SNP/InDel sites with a depth <5x were marked "NA". The homozygosity or heterozygosity of SNPs/Indels were identi ed based on the allele frequency where homozygous AF ≥ 0.8 or AF ≤ 0.2, and heterozygous 0.2 < AF < 0.8. Each allele was read ≥ 5 times. Polymorphic sites were ltered (with a miss rate < 20%, minor allele frequency (MAF) > 0.01, and heterozygosity < 10%) using Vcftools v0.1.17 (Danecek et al. 2011).
Genotyping-by-sequencing (GBS) 100 ng samples of genomic DNA were digested for 2 h at 37 °C with 10 U each of PstI and MspI (New England Biolabs, Ipswitch, MA) in a 50 µL reaction volume containing 1× NEB Buffer 2.1. In-house pretreated adapters were then ligated to sticky ends by adding 30 µL of a master mix solution containing 4 U of T4 ligase (New England Biolabs) in each reaction. Samples were incubated at 22 °C for 2 h and heated at 65 °C for 10 min to inactivate the T4 ligase. Ligated samples were puri ed using DNA clean beads (Automag, USA) according to the manufacturer's instructions. Index PCR reaction was then performed in a 50 µL volume using HotStart ReadyMix (Kapabiosystems, USA). These ampli ed products were puri ed as described above and quanti ed using Qubit 3.0. Equal amounts of puri ed products were pooled for agarose gel electrophoresis. Fragments with sizes of 300-500 bp were extracted from the gel, puri ed by column tube, and quanti ed using Qubit 3.0 before sequencing on Illumina Hiseq X Ten. Out of 587 primer pairs tested for multiplex PCR reaction followed by high-throughput sequencing, 87 were discarded due to either a lower capacity of fragment capture, production of multiple fragments, or extensive/low PCR ampli cation e ciency. As a result, ve hundred primer pairs were quali ed applicable for barley multiplex PCR ampli cation (BarPlex v1.0) (Supplementary Table 2). The number of target fragments distributed on each chromosome varied from 54 to 84 (Fig. 1a), with higher density towards the telomeres (where a higher recombination rate is generally present) (Mascher et al. 2017).
We conducted four independent experiments with analysis of the 1,068 genotypes that included 51 wild barley (Hs), 248 Tibetan semi-wild barley, 345 landraces, 329 cultivars, as well as 95 F 2 segregants ( Table   1). The detection of target fragments in each experiment varied from 99.4% to 99.8% (Fig. 1b), with the average sequencing depth (x) between 467 and 1010 (Fig. 1c). For each of the 1,068 genotypes, the detection rate ranged from 96.4% to 100% with a mean of 99.7% (Fig. 1d), and the average sequencing depth across all samples was 757 (Fig. 1e). In addition, we analyzed the detection rate and sequencing depth of the 500 targets across all samples. Of these, 466 targets (93.2%) were detected in >99.5% samples, whereas only four target fragments were detected in less than 90% samples ( Fig. 1f-g). We did not observe a signi cant difference in the number or size of detected fragments in the populations of wild, Tibetan semi-wild, or cultivated barley (Table 1). In comparison to barley landraces and cultivars, fewer polymorphic target SNPs were observed in the populations of H. spontaneum and Tibetan semiwild barley. This is likely due to the origin of target SNPs which were identi ed from the population of cultivated barley. By identifying the polymorphic sites that were derived from the 91.1 kb of target sequences, a higher number of SNPs was observed in H. spontaneum, but not among Tibetan semi-wild barley. Considering the detection rate and sequencing depth, as well as the number of polymorphisms, we would like to conclude that BarPlex v1.0 is a robust and complexity-reduction assay applicable for genotyping in barley plants.  Fig. 2a). In each sample, 99.2%-100.0% (mean = 99.7%) of the target fragments were detectable using BarPlex v1.0, whereas the detection rate using GBS after quality control (removing missing rate >20%) was 81.5% -97.6% (Fig. 2b). The sequencing depth using BarPlex v1.0 was 310-fold higher than that of GBS (Fig. 2c). Each of the SNPs in BarPlex v1.0 was detected in 83.3% -100% (mean = 99.9%) of samples, whereas this number in GBS arranged from 80.2% to 100% with a mean of 91.5% (Fig. 2d-e). Based on the principal component analysis (PCA) with the dataset from either BarPlex v1.0 or GBS, a similarity of unlocking the population structure diversity was observed (Fig. 2f-g). With these results, we would like to conclude that BarPlex v1.0 is a reliable and highly accurate assay informative for genotyping in barley.
Applications for heterozygosity discrimination, variety pedigree, and linkage mapping and GWAS We further checked if polymorphisms revealed by BarPlex v1.0 were applicable in genetic studies of barley. Here, 495 target SNPs, as well as their residing fragments (90.3 Kb in total) representing 3,220 polymorphic sites were included. First, the heterozygosity revealed by quantifying the allele frequency on polymorphic sites was analyzed (Fig. 3a). All the individuals in the F 2 population remained heterozygous at a proportion of polymorphic sites (mean = 6.5%). In comparison, only ten samples of 973 in natural materials were found to remain heterozygous (cutoff: ≥4%). Moreover, we analyzed the heterozygosity revealed by target SNPs. An overestimate in comparison to that revealed by polymorphic sites, was observed (Fig. 3a).
Second, comparisons of various landraces/cultivars values using both target SNPs and polymorphic sites revealed interesting results. "Morex", a six-rowed US malting variety released in 1978, has been used as the reference genome sequence in barley (Mascher et al. 2017). "Morex" seeds provided by two breeding units in the downstream valley of Yangtze River in China differed from the original "Morex" (Fig.  3b). In contrast to this, three Chinese landraces/historic cultivars which have been cultivated widely during last century ("Chi Ba Da Mai", "Xiu Ning Ai Jiao Da Mai", and "Chi Ba Huang") were found to have a higher identity (Fig. 3c). Comparison of the two-rowed malting barley cultivar "Zao Shu 3 Hao" (Kanto Nijo 3), which was introduced from Japan in 1960s, to its Co 60 -radiation variant "Yan Fu Ai Zao 3" revealed a semi-dwarf mutation. Both "Zao Shu 3 Hao" and "Yan Fu Ai Zao 3" are founder varieties that were widely cultivated in 1970s and 1980s in China, and their genetic similarity was traceable using BarPlex v1.0 (Fig. 3d). The historic cultivar "Yu Da Mai 1 Hao" was reported to be developed from a mutation of "Zao Shu 3 Hao", whereas the genotyping result suggested it may be derived from an outpollination rather than a mutation (Fig. 3d). Notably, the polymorphic sites from the 90.3 kb of sequences showed a better resolution on variety pedigree discrimination than 495 target SNPs. . We conducted a genome-wide association study (GWAS) in 973 barley accessions (excluding 95 F 2 lines out of 1068 samples) with 3,220 polymorphic sites, and a single peak associating with the NUD locus was revealed (Fig. 4a). In addition, linkage mapping using a bi-parental F 2 population was conducted by genotyping with BarPlex v1.0 (Fig. 4b-c). Here, 108 target SNPs and 442 polymorphic sites between both parents were revealed on the seven chromosomes ( Table 1). The naked trait was delimited to a 40.2 cM genetic interval spanning approximately 20 Mb (Fig. 4d) where the previously-identi ed NUD locus was present (Taketa et al. 2008).
Collectively, these results indicated that BarPlex v1.0 is applicable in multiple barley genetic studies.
Tibetan semi-wild barley represented genetic similarity with Chinese landraces but lower diversity Tibetan semi-wild barley, also referred as Tibetan weedy barley (Zeng et al. 2018), has a characteristic brittle rachis that is also seen in wild barley H. spontaneum of the Near East. This has raised a hypothesis that Tibet may be an independent domestication center of native wild barley (Aberg, 1938 Table 1). With the polymorphic sites, principal component analysis (PCA) revealed that H. spontaneum constitutes a group isolating from domesticated barley, while the Tibetan semi-wild barley was grouped with Chinese barley landraces (Fig. 5a). A rapid decrease in genetic diversity (speci cally, nucleotide diversity, π) from Chinese barley landraces to Tibetan semi-wild barley was also observed ( Table 2). Selections represented by the estimated signi cant negative values of Tajima's D, as well as Fu and Li's D * and F * suggested a founder effect occurring in Tibetan semi-wild barley population. Statistical analysis using pre-selected target SNPs derived from cultivated barley may give an underestimation of the genetic diversity within the H. spontaneum population (Table 1), thus resulting in a deviation of the genetic relationships in principal component analysis (Fig. 5b). The genetic diversity and the PCA analysis in a large panel of Tibetan semi-wild barley, Chinese barley landraces, and H. spontaneum inferred that Tibet was not an independent center of domestication for the native wild barley. Using this strategy, we established a high-throughput genotyping assay (BarPlex v1.0) that combines multiplexed PCR (that ampli es 500 genome-wide targeted segments of barley) with high-throughput sequencing of a pool of barcoded PCR products. By genotyping thousand samples from multiple experimental runs, this assay has identi ed target SNPs (with 54 to 84 SNPs on each chromosome), as well as additional polymorphic sites on 91.1 kb of combined sequences. Importantly, this method has worked well with several types of barley (H. spontaneum, Tibetan semi-wild barley, landraces, and cultivars, as well as a segregation population) in genetic studies with multiple objectives (i.e., population genetic diversity, variety pedigree, heterozygosity discrimination, preliminary mapping, and GWAS). In contrast to the popular complexity-reduced GBS methodology, fewer polymorphisms were revealed by BarPlex v1.0. Here, only 91.1 kb of sequences were captured. Still, this assay is most attractive due to (1) accuracy and reproducibility, considering a sequencing depth of over 700x and a missing rate of less than 0.5%, (2) a customized design that enables incorporation of additional loci/markers especially associated with particular targeted traits, and (3) a lower cost for both data producing and handling (ca. 20% and 5%, if compared to the commercial price charged by GBS and KASP, respectively). Thus, we believe BarPlex v1.0 to be a choice high-throughput and robust method to genotype barley and the genotyping strategy that also works for other species.

Discussion
In using BarPlex v1.0, this study genotyped a larger panel of barley genotypes. This powerful and highthroughput method can be used to determine genetic similarity and heterozygosity using either target SNPs or polymorphic sites. This methodology was able to classify different varieties, but failed in the case of a barley variety and its mutagenized offspring. This could be resultant to the limited resolution of this assay (if compared to GBS or WGR). For population diversity analysis, we found that genotyping using pre-selected target SNPs can underestimate the genetic diversity in wild barley populations. That might be explained by ascertainment bias, similar to an earlier study that genotyped barley landraces by SNP array ( Thus, choosing a genotyping platform (BarPlex, GBS, or WGR) by consideration of laboratory effort, costs, and bioinformatics capability is highly recommended.
Cultivated barley (rough rachis) was domesticated from its wild progenitor H. spontaneum (brittle rachis) in the Fertile Crescent about 10,000 years ago. Cultivated barley carries the loss of functional alleles (btr1 or btr2) at either Btr1 or Btr2 locus (Pourkheirandish et al. 2015), while Tibetan semi-wild barley has functional Btr1 and Btr2 as in wild barley H. spontaneum. By genotyping in a large collection of Tibetan semi-wild barley accessions as well as wild barley, landraces, and cultivars, this study uncovered a dramatic decrease in genetic diversity within the Tibetan semi-wild barley population. This is consistent with a previous study involving a population of cultivated hulless barley "qingke" (Zeng et al. 2018). Without enduring the domestication bottleneck, wild progenitors (i.e., H. spontaneum) are expected to host higher genetic diversity and abundant minor alleles (Milner et al. 2019). However, neither of these characters were observed in Tibetan semi-wild barley population, thus questioning Tibet as a center of domestication for the native wild barley. Tibetan semi-wild barley accessions were mixed with Chinese barley landraces, but separated from H. spontaneum. This implied that Tibetan semi-wild barley was unlikely derived from mutations of H. spontaneum or hybridization between H. spontaneum and local cultivated barley, while both models were proposed to explain the origin of H. agriocrithon, the six-rowed Tibetan semi-wild barley (Berg 1940; Tanno and Takeda 2004). Unlocking the sequence diversity in dozens of Tibetan semi-wild barley accessions con rmed that Btr1 and Btr2 haplotypes were derived from cultivated barley, suggesting Tibetan semi-wild barley originated from hybridization between two independent domestication events (btr1Btr2 and Btr1btr2 in barley landraces), followed by combination and reoccurrence of the brittle rachis (Btr1Btr2) (Pourkheirandish et al. 2018). This explanation aligns with our results showing that a genetic similarity between Tibetan semi-wild barley and Chinese barley landraces/cultivars was higher than with H. spontaneum.

Declarations Data availability statement
The short reads generated by high-throughput Illumina sequencing have been deposited in the public NCBI database. The accession IDs for each entry are listed in Supplementary Table 1.

Con ict of interest
The authors declare no con icts of interest.   Fig. 1c).