Background: Demand for cassava is expected to grow as the crop is increasingly relied on to buffer the effects of climate change, sustain food security and supply raw materials for industry. Meeting this demand requires modern omics methods that support reliable, precise and timely delivery of more productive and resilient varieties. This study therefore aimed to contribute to the accurate identification of cassava accessions from a mix of duplicate clones, diverse local landraces (LARs) and improved genotypes (IMGs) in farmer fields, a prerequisite for effective cassava breeding.
Results: A total of 112 accessions, sampled through a field survey in the major cassava-growing regions of Kenya, were genotyped using single nucleotide polymorphism (SNP) markers generated by genotyping-by-sequencing (GBS). Of the 33,672 SNPs, 88% were anchored onto chromosomes, 3% mapped to scaffolds and 9% could not be mapped. After linkage disequilibrium (LD) pruning and identity-by-state matrix estimation, 5,808 SNPs were retained and used for hierarchical clustering and ADMIXTURE analysis of ancestry. With the number of subpopulations (K) tested from 2 to 20, 5-fold cross-validation identified 14 subpopulations, from which the population structure was modeled. Approximately 48% of the accessions were classified into 17 independent clusters of identical clones or duplicates. The remaining 52% were admixed and were therefore considered unique, non-duplicated clones; removing duplicates reduced the number of distinct samples from 112 to 73. Of the 17 duplicate clusters, 10 were formed from LARs, four from IMGs, and three from a mix of LARs and IMGs. The largest and smallest clusters contained 8 and 2 accessions, respectively. About 71% of clusters contained accessions from the same geographical region, while 29% contained accessions from different regions. The results revealed genetic relationships among LARs and IMGs. Duplication of LARs was attributed to historical sharing or exchange of planting materials by farmers, while duplicates of IMGs could be attributed to convergent evolution, selection, or common parentage. The high number of admixed or unique clones implied minimal loss of genetic diversity. The geographical restriction of clusters pointed to minimal movement of planting materials across the country, perhaps linked to either an inefficient seed distribution system or disease-driven quarantine measures.
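As an illustration of how duplicates can be flagged from an identity-by-state (IBS) matrix, the following is a minimal sketch and not the pipeline used in this study. It assumes genotypes coded as 0/1/2 alternate-allele counts with no missing calls; the function names and the 0.99 similarity threshold are hypothetical choices made for the example.

```python
import numpy as np

def ibs_matrix(genotypes):
    """Pairwise IBS similarity for a samples x SNPs matrix coded 0/1/2.

    Per-locus similarity is 1 - |g_i - g_j| / 2, averaged over loci.
    Assumes no missing genotypes (an assumption of this sketch).
    """
    g = np.asarray(genotypes, dtype=float)
    diff = np.abs(g[:, None, :] - g[None, :, :])  # shape: (n, n, loci)
    return 1.0 - diff.mean(axis=2) / 2.0

def duplicate_clusters(ibs, threshold=0.99):
    """Group samples whose IBS is >= threshold (hypothetical cutoff).

    Uses union-find, so clusters are connected components of the
    'near-identical' graph. Returns only clusters with >= 2 members.
    """
    n = ibs.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if ibs[i, j] >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]

def count_unique(n_samples, clusters):
    """Distinct clones left after keeping one representative per cluster."""
    return n_samples - sum(len(c) - 1 for c in clusters)

# Toy data: samples 0, 1 and 3 are identical clones.
toy = [
    [0, 1, 2, 0],
    [0, 1, 2, 0],
    [2, 1, 0, 2],
    [0, 1, 2, 0],
    [1, 1, 1, 1],
]
toy_ibs = ibs_matrix(toy)
toy_clusters = duplicate_clusters(toy_ibs)
toy_unique = count_unique(len(toy), toy_clusters)
```

In practice the threshold would need to allow for genotyping error, and missing calls would have to be masked before averaging.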
Conclusions: GBS was successfully used to study the genetic relatedness of cassava genetic resources and to identify varieties in farmer fields. This omics approach and the data generated here could be adopted by breeders and other stakeholders to design efficient and effective cassava improvement programs, which might include developing a core set of diagnostic markers for quality assurance, disease resistance and targeted genomic profiling in cassava.