Agriculture is the mainstay of millions of low-income households in Sub-Saharan Africa (SSA). However, productivity is far below the yield potential of major crops because several interacting factors reduce yield. The paucity of nutritionally improved and resilient varieties is a crucial constraint that can be mitigated by the rapid development of cultivars adapted to specific agroecological zones. The current trend in yield gain in major food crops shows that relying on conventional breeding alone is insufficient to meet the food needs of an estimated 9 billion people in 2025 [1]. There is a need to accelerate genetic gain by deploying new breeding strategies [2, 3]. This need has led the scientific community to invest massively in genomic resources and support systems that provide useful tools to accelerate the breeding process and develop high-yielding cultivars [4]. Molecular markers are one such genomic tool, commonly used for molecular breeding in the form of marker-assisted selection (MAS), including genomic selection (GS). The availability of DNA-based molecular markers has provided a means for better discrimination and faster determination of the genetic purity of inbreds and for hybrid verification. They are useful for varietal identification, diversity analysis, and genomics-assisted breeding.
Molecular markers are advantageous over morphological markers because they are unaffected by environmental conditions or the plant's physiological stage.
Simple sequence repeats (SSR) have been the preferred marker for assessing genetic purity because of their multi-allelic and co-dominant nature, high polymorphism, and wide distribution throughout the genome [5]. However, the latest advances in molecular biology technologies have shifted the emphasis toward single nucleotide polymorphism (SNP) markers [6]. SNP markers have become powerful tools for genetic discovery research and marker-assisted breeding for crop improvement [7, 8]. They are abundant in the genome, co-dominant, inexpensive to genotype, and amenable to high throughput analysis, and they require only a simple documentation system [9]. Besides their cost-effectiveness, speed, and flexibility, a unique advantage of SNP markers is the availability of various medium to high throughput genotyping platforms that address different breeding needs for varying marker densities and cost per sample. High throughput platforms such as the sequence-based Next-Generation Sequencing (NGS) technologies (genotyping by sequencing, GBS) [10], array-based sequencing technologies (diversity arrays technology, DArTseq, and Sequenom) [11], and chip-based technologies have made SNPs very popular for various applications. The application of molecular markers in crop breeding has significantly impacted the development of major food crops, including maize [12, 13, 14, 15, 16]. The Maize Improvement Program (MIP) of the International Institute of Tropical Agriculture (IITA) uses molecular markers at various stages of the breeding pipeline, including quality control (QC) for hybrid verification, fingerprinting of parental lines to assess genetic diversity, and DNA fingerprinting for adoption tracking of improved varieties [17]. Breeders start their work by assessing the extent of genetic variation in their germplasm collection to guide the selection of parents for various breeding objectives.
Marker-based diversity analysis is commonly used to assess genetic variation and heterotic grouping [17]. Once crosses are made between selected parents, hybrid verification can be performed with molecular markers rapidly and at a low cost. Different molecular breeding schemes require different sets of markers for progeny selection. For instance, GS is done with NGS-based genome-wide markers such as GBS or DArTseq, while marker-assisted backcrossing (MABC) or forward breeding utilizes selected predictive markers strongly linked to priority traits.
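The logic of marker-based hybrid verification can be illustrated with a minimal sketch. It assumes bi-allelic SNP calls stored as two-character strings (for example "AA" or "AG") and treats a marker as informative only when both parents are homozygous for different alleles, so that a true F1 is expected to be heterozygous. The function name, data layout, and marker names are hypothetical and are not taken from the workflow described in this study.

```python
def verify_f1(parent_a, parent_b, f1):
    """Fraction of informative markers at which the F1 carries one allele from each parent.

    Each argument maps a marker name to a two-character genotype call.
    Returns None when no marker is informative.
    """
    informative = 0
    consistent = 0
    for marker, call_a in parent_a.items():
        call_b = parent_b.get(marker)
        call_f1 = f1.get(marker)
        if call_b is None or call_f1 is None:
            continue  # missing call in one of the three samples
        if call_a[0] != call_a[1] or call_b[0] != call_b[1]:
            continue  # a heterozygous parent makes the marker uninformative
        if call_a == call_b:
            continue  # parents are monomorphic at this marker
        informative += 1
        # A true hybrid carries exactly one allele from each parent.
        if set(call_f1) == {call_a[0], call_b[0]}:
            consistent += 1
    if informative == 0:
        return None
    return consistent / informative


# Hypothetical calls for two inbred parents and two putative F1 plants.
parent_a = {"snp1": "AA", "snp2": "CC", "snp3": "GG", "snp4": "TT"}
parent_b = {"snp1": "GG", "snp2": "CC", "snp3": "AA", "snp4": "CC"}
true_hybrid = {"snp1": "AG", "snp2": "CC", "snp3": "GA", "snp4": "CT"}
selfed_plant = {"snp1": "AA", "snp2": "CC", "snp3": "GG", "snp4": "TT"}

print(verify_f1(parent_a, parent_b, true_hybrid))
print(verify_f1(parent_a, parent_b, selfed_plant))
```

In practice, a laboratory would set its own threshold on this fraction to allow for genotyping error and residual heterozygosity in the parental lines.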
However, various bottlenecks have limited the impact of molecular breeding on crop improvement, particularly in developing countries [18, 19]. The major limiting factors are lack of infrastructure and capacity for genomics resources and poor information flow, resulting in reduced access to operational and decision support tools [17]. Because private companies in developed countries own the proprietary rights to many of the emerging genomics resources and systems, it is difficult for public research sectors, non-profit research institutes, and small laboratories in developing countries to have direct access. These challenges are being addressed by various international initiatives such as the Excellence in Breeding (EiB) platform, which coordinates its activities with the Genomic and Open-source Breeding Informatics Initiative (GOBii) and High Throughput Genotyping (HTPG). There is also the Integrated Breeding Platform (IBP)-hosted Generation Challenge Program (GCP) and the Breeding Management System (BMS) [20], which target the development and adoption of molecular breeding in developing countries. These and other Consultative Group-hosted initiatives and platforms galvanize worldwide partners drawn from public, private, and governmental institutions towards the common goal of increasing agricultural productivity through efficient tools, technologies, and data management systems.
Despite the availability of many low-cost genotyping platforms and resources, it is not easy to meet the genotyping needs of many users who work on different crops and in different locations. There is a need to complement these international initiatives by providing in-house or local (regional) genotyping platforms, where possible, to accelerate the genotyping workflow. One such regional initiative in Africa is the Integrated Genotyping Support Services (IGSS) genotyping facility at BecA/ILRI (Biosciences eastern and central Africa/International Livestock Research Institute), Kenya. Because these international initiatives are grant-driven, short-term projects, it is imperative to devise a sustainable strategy for a routine, cost-effective, and easily accessible genotyping service. Under such a strategy, breeders would either outsource to genotyping service providers or set up a core facility in-house.
One factor that influences breeders' choice of genotyping platform is the level of throughput. Other factors considered are the data turnaround time, ease of data analysis (available informatics), reproducibility, flexibility, and cost per datapoint or cost per sample [21, 22]. For high and ultra-high throughput markers, breeders outsource to array- and sequence-based genotyping service providers. These platforms are suitable for discovery applications and for approaches requiring hundreds to thousands of samples to be genotyped with tens to thousands of markers, such as genome-wide association studies (GWAS), gene mapping, and large-scale genomic selection. They are also suitable for genotyping a few samples with many markers (multiplexing), as in genetic diversity analysis or background selection. While multiplex platforms provide higher throughput with lower reagent consumption, they limit scientists to using a multiplexed set of several thousand SNPs per assay. The high per-sample cost of highly multiplexed platforms, together with the time required for initial assay development, can be problematic for crop improvement applications, which usually require low- to medium-density markers [22]. These platforms are also costly in informatics needs and presently produce datasets with a significant percentage of missing data [23]. Multiplex platforms are not suitable for small MAS projects, which genotype small numbers of samples with few target markers. For example, in MABC, few markers may be used to introgress a particular gene. Similarly, for genetic identity and purity analysis of inbreds, a small number of samples is genotyped with tens to a hundred markers. In hybrid verification studies, a subset of carefully developed SNP markers can be used. Likewise, in variety identification and adoption tracking studies, where a library of released varieties fingerprinted with a selected set of SNPs is required, the multiplex platforms may not be cost-effective.
For these low- to mid-density genotyping approaches, a uniplex SNP genotyping platform is appropriate.
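The cost argument can be made concrete with a small calculation. The sketch below assumes that a uniplex platform is billed per datapoint (one marker on one sample), whereas a fixed multiplex array carries a flat per-sample cost regardless of how many of its markers are actually needed. All prices are placeholders chosen for illustration, not quotes from any provider, and the function names are hypothetical.

```python
import math


def uniplex_cost_per_sample(n_markers, cost_per_datapoint):
    """Per-sample cost when each marker is run as a separate uniplex assay."""
    return n_markers * cost_per_datapoint


def breakeven_markers(array_cost_per_sample, cost_per_datapoint):
    """Largest marker count at which uniplex costs no more than the fixed array."""
    return math.floor(array_cost_per_sample / cost_per_datapoint)


# Placeholder prices in arbitrary currency units (assumptions, not real quotes).
ARRAY_COST_PER_SAMPLE = 12.0
COST_PER_DATAPOINT = 0.25

for n_markers in (10, 48, 100):
    cost = uniplex_cost_per_sample(n_markers, COST_PER_DATAPOINT)
    cheaper = "uniplex" if cost <= ARRAY_COST_PER_SAMPLE else "array"
    print(n_markers, cost, cheaper)

print(breakeven_markers(ARRAY_COST_PER_SAMPLE, COST_PER_DATAPOINT))
```

Under these assumed prices, uniplex genotyping remains the cheaper option up to a few dozen markers per sample, which is the range typical of QC, MABC, and hybrid verification panels.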
Uniplex genotyping assays are small-scale high throughput genotyping systems that offer flexible assay design, ease of use, and cost-effectiveness. These systems provide plant breeders with the flexibility to mix and match different SNPs for a given sample set. They allow breeders to use a smaller subset of informative SNPs such as functional SNPs and trait-specific haplotypes, thereby eliminating the unintended datapoints generated when using fixed array SNPs. A range of uniplex SNP genotyping assays has been developed, particularly for the medical field. However, the most competitive uniplex systems that have been successfully applied in crop improvement research are TaqMan [24, 25, 26], KASP [22, 27], Amplifluor [28], and rhAmp [29] assays. These uniplex genotyping systems vary in reaction chemistry, detection method, and reaction format. Uniplex systems can either be outsourced or installed in-house.
This study utilized the Kompetitive allele-specific PCR (KASP) assay because it is one of the most widely used assays among plant breeders and biologists. KASP is an endpoint PCR-based SNP genotyping method from KBiosciences, now LGC Biosearch Technologies, UK. KASP uses fluorescently labeled allele-specific primers for the bi-allelic discrimination of SNPs and INDELs. KASP was developed to reduce cost, mainly from probe design, and to improve genotyping efficiency, and it has become a preferred alternative to TaqMan [22, 30]. The KASP genotyping system has been successfully applied in crops such as maize [22, 31], wheat [21, 27, 32], rice [33], soybean [34], and peanut [35], among others. KASP has developed into a global benchmark technology for genotyping crop plants [22, 35, 36, 37, 38] following the validation of KASP markers across crops of global importance (such as maize, 1250 markers; wheat, 1864 markers; and rice, 2015 markers) by the Generation Challenge Program of the Integrated Breeding Platform [20]. CIMMYT has successfully utilized the 1250 maize KASP markers for various genetic applications, including quantitative trait loci (QTL) mapping, marker-assisted recurrent selection (MARS), allele mining, and QC analysis [22].
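Bi-allelic discrimination from endpoint fluorescence can be sketched as follows. The example assumes two normalized fluorescence signals per sample, one reporting each allele, and assigns a genotype from the angle of the sample in the two-signal plane. The intensity cutoff, the angular band for heterozygotes, and the function name are illustrative assumptions; they do not reproduce the clustering algorithm of any vendor software.

```python
import math


def call_genotype(signal_1, signal_2, min_intensity=0.2, het_band=(30.0, 60.0)):
    """Assign a bi-allelic genotype from two endpoint fluorescence signals.

    signal_1 and signal_2 are normalized signals for allele 1 and allele 2.
    Samples close to the origin (for example failed reactions or
    no-template controls) are left uncalled.
    """
    intensity = math.hypot(signal_1, signal_2)
    if intensity < min_intensity:
        return "no_call"
    # Angle from the allele 1 axis: near 0 degrees means mostly allele 1 signal,
    # near 90 degrees means mostly allele 2 signal.
    angle = math.degrees(math.atan2(signal_2, signal_1))
    if angle < het_band[0]:
        return "allele1/allele1"
    if angle > het_band[1]:
        return "allele2/allele2"
    return "allele1/allele2"


# Hypothetical readings for four wells.
for reading in [(1.0, 0.05), (0.05, 1.0), (0.8, 0.8), (0.05, 0.05)]:
    print(reading, call_genotype(*reading))
```

Fixed angular thresholds are used here only for clarity; real datasets are usually clustered per marker and per plate because signal balance differs between assays.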
The Maize Improvement Program of IITA has generated over 17,000 datapoints using KASP for different genotype analyses, including QC and MAS. However, some bottlenecks in the genotyping workflow slow down the genotyping process and delay crop improvement: (1) method of sample collection and processing (freeze-drying, grinding, and sample prep), (2) DNA quality and quantity analysis, (3) DNA-based genotyping, and (4) data analysis. Gedil and Menkir (2019) provided a thorough review of the MIP group's molecular marker-based crop improvement activities. However, reports are lacking on how to accelerate the entire genotyping process by minimizing these bottlenecks and on cost-effective genotyping workflows suitable for small-scale breeders and laboratories in developing countries. This work aimed to show how deploying an in-house genotyping platform alongside optimized molecular techniques can accelerate maize improvement activity and shorten result turnaround, providing a workflow that can also be adopted by less sophisticated breeding laboratories in developing countries for efficient crop improvement.
Hence, our objectives were (1) to develop an efficient and cost-effective genotyping workflow that removes or minimizes significant bottlenecks in the molecular marker-based genotyping process, namely through optimized leaf sample collection, high-quality DNA extraction, optimized high throughput DNA quantitation, and a low-cost in-house KASP genotyping system; and (2) to deploy the workflow to ascertain genetic identity among adopted maize varieties and to verify the parentage of F1 maize lines using a panel of QC SNP markers, together with trait markers for MAS of high provitamin A (PVA) and aflatoxin-resistant maize lines.