To support sweet potato breeding programs, it is essential to assess the genetic diversity and relationships among cultivars. The ubiquity and abundance of LTR retrotransposons in the plant genome have made them valuable for studying genome-wide variation and diversity. The retrotransposon-based genetic DNA fingerprinting method could provide potentially useful genetic information. The increasing amount of sequence data released by next-generation sequencing technology provides a valuable resource for the development of retrotransposon-based markers. These retrotransposon-based markers have been applied successfully to analyze the genetic diversity in various plant species.
In the current study, we confirmed that the sweet potato accessions contained the full complement of LTR retrotransposons. Structural analysis revealed that these elements are transcriptionally active and could be functional. The developed RBIP primers produced both polymorphic and monomorphic alleles, which enabled the use of RBIP DNA fingerprinting to evaluate genetic diversity among the 105 sweet potato accessions. In this genome-wide analysis, only 12.5% of RBIP markers generated polymorphic bands, indicating that the inter-LTR regions in the genomes of the studied accessions were highly conserved. This implies that the sweet potato genome is still evolving and that LTRs are not very active in contributing to genome-wide variation. To the best of our knowledge, this is the first study of genetic diversity in sweet potato using RBIP-based fingerprinting.
A previous report indicated that 7.37% of the sweet potato genome (approximately 4.4 Gb) was identified as LTR sequence (Si et al., 2016), while 10.987% of the genome was identified as full-length LTR retrotransposons in the present study. This small difference (10.987% vs. 7.37%) might be attributed to the different approaches and parameters used in the two studies. In our study, only putative full-length LTR retrotransposons with two very similar LTR sequences were isolated. The ratio of Ty3-gypsy to Ty1-copia elements reflects the relative contribution of each superfamily to the sweet potato genome. Our results showed that full-length copia LTR retrotransposons were more common than gypsy retrotransposons in the sweet potato genome (Table 1). The ratio of Ty3-gypsy to Ty1-copia was 1:1.6, higher than that in a previous study (1:1.15)29. The numbers of full-length LTR retrotransposons in the different subfamilies were generally low: gypsy subfamilies had more single sequences (95.8%) than copia subfamilies (94.9%), and only 0.6% (49) of subfamilies contained more than 3 LTR retrotransposons. These findings are consistent with the results reported in other plants with different genome sizes30,31,32.
New bioinformatics software offers exciting perspectives for the development of new markers based on whole genome sequences. In the sweet potato genome, the most abundant retrotransposon families were copia and gypsy, accounting for 3.64% and 3.62%, respectively. However, simple sequence repeats contribute only 1.94% of the genome29, which indicated that RBIP markers were more ubiquitous than SSR markers in sweet potato. Although many SSR markers have been developed from sweet potato, almost all SSRs (86.1%) have mononucleotide or dinucleotide repeat motifs, and “stutter bands” or increased mutation rates in repeat lengths may create issues for using SSR markers33,34,35. RBIP markers amplify a single locus in samples, differing from SSR markers that potentially amplify two or possibly more homologous loci. Compared with other retrotransposon-based markers (IRAP, REMAP, and SSAP), which display polymorphisms in band size owing to retrotransposon insertion, RBIP markers can detect the presence or absence of retrotransposons in a locus produced by the integration of an element36.
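The presence/absence scoring that distinguishes RBIP from size-based markers can be illustrated with a small sketch. The accession names, marker names, and scores below are hypothetical and are not data from this study; a marker is treated as polymorphic when both the occupied (1) and empty (0) states occur among the accessions.

```python
# Hypothetical RBIP scores: 1 = retrotransposon insertion present at the locus,
# 0 = empty site. Names and values are illustrative, not data from this study.
scores = {
    "accession_A": {"marker_1": 1, "marker_2": 1, "marker_3": 0},
    "accession_B": {"marker_1": 1, "marker_2": 0, "marker_3": 0},
    "accession_C": {"marker_1": 1, "marker_2": 1, "marker_3": 1},
}


def polymorphic_markers(score_table):
    """Return the markers for which both states (0 and 1) are observed."""
    markers = sorted({m for row in score_table.values() for m in row})
    result = []
    for marker in markers:
        states = {row[marker] for row in score_table.values()}
        if states == {0, 1}:
            result.append(marker)
    return result


def polymorphism_rate(score_table):
    """Fraction of markers that are polymorphic across all accessions."""
    n_markers = len({m for row in score_table.values() for m in row})
    return len(polymorphic_markers(score_table)) / n_markers


poly = polymorphic_markers(scores)
rate = polymorphism_rate(scores)
print(poly, round(rate, 3))
```

In this toy table, marker_1 is monomorphic (present in every accession), so it carries no information for separating accessions, whereas the other two markers do.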
Of the 48 RBIP markers we developed, 21 and 27 primer pairs were related to the insertion of copia and gypsy retrotransposons at a particular locus, respectively. Because LTRs from the same subfamily are highly similar, primers designed from these LTRs may share the same left primer sequence; for example, the left primer sequences of LTR10, LTR11, LTR13 and LTR20 were identical, but the downstream sequences differed, so specific amplification was not affected. Diversity analysis showed that copia and gypsy LTR retrotransposons were present in all sweet potato varieties, which suggests that they have existed in the species for a long time. In several varieties, some primers did not amplify fragments (for example, LTR10 in Zipibaixin), suggesting that the retrotransposons were absent from these loci. However, insertions of copia and gypsy retrotransposons were detected at more than one locus in most of the cultivars. These results imply that copia and gypsy retrotransposons replicated many times during the development of cultivated sweet potato, which might explain why 10.98% of the genome was identified as LTR retrotransposons in this study.
Additionally, previous RBIP markers for sweet potato were developed by constructing primer binding site libraries on sequencing platforms and then screening for LTR sequences. Markers generated by this method are highly specific, so their polymorphism will be limited in other germplasm resources, and their usefulness in genetic diversity analysis of this species may be lower than that of genome-derived markers. This study instead searched and screened LTR sequences in the genome sequence of the sweet potato cultivar ‘Taizhong 6’. The RBIP markers we developed should therefore be broadly applicable across sweet potato resources.
According to the STRUCTURE analysis, the 105 germplasms can be divided into two groups when K = 2 (Fig. 2), indicating that their genetic background derives from two gene pools. Almost all the germplasms had unique backgrounds, except ‘Xushu 18-1’, ‘Jinguafanshu’, and ‘Taizhong 6’. ‘Xushu 18-1’ (p330683034) was released by the Xuzhou Regional Agricultural Research Institute in 1972. It was selected from the cross ‘Xindazi’ × ‘52-45’ with an inbreeding backcross, and ‘52-45’ was a hybrid offspring of ‘Nancy Hall’ × ‘Okinawa 100’. Previous studies have shown that most sweet potato cultivars have a genetic background of Okinawa 100 from Japan and Nancy Hall from the USA37,38,7,10. On this basis, we inferred that the two gene pools may be Nancy Hall and Okinawa 100. However, in the K = 3 model (Fig. 2), most of the accessions had one major gene pool and a small minor gene pool, except ‘Xushu 18-1’, which had two main gene pools and a small minor gene pool. These results show that ‘Xushu 18-1’ has a wider genetic background than the other accessions. The genetic background of sweet potato is narrow in Zhejiang, and even across China, so it is necessary to broaden the genetic background of sweet potato varieties, enrich their genetic diversity and protect high-quality germplasm resources.
All the accessions can be divided into three groups (group I represented by green, group II by red, and group III by blue) according to the PCA results and UPGMA dendrogram (Fig. 3, Fig. 4). In group I, 7 of the 8 accessions were improved varieties, and the other 21 improved varieties were scattered across groups II and III. This result indicated that the improved varieties were, to some degree, more similar to one another in genetic background than to other germplasms. ‘Hongpibaixin-11’ is a variety selected by local farmers and known for its phenotypic traits such as leaf shape, root tuber skin color, and root tuber flesh color. There were 11 and 9 improved varieties in groups II and III, respectively, and the landraces were scattered across the two groups. The varieties ‘Zhe 38’, ‘71438’, ‘Zhe 81’, ‘Zhe 75’, and ‘Zheshu 48’, which were bred by the Institute of Crop and Nuclear Technology Utilization, Zhejiang Academy of Agricultural Sciences, clustered in the third subgroup of group I, illustrating that these varieties had similar genetic backgrounds in their hybrid combinations. The same was true of ‘Zhe 255’ and ‘Xinxiang’, and of ‘Zhezishu 5’, ‘Zheshu 6025’ and ‘Nanshu 88’.
The results of the Bayesian model using K = 2, the UPGMA dendrogram, and the PCA were highly consistent; group I contained 8 accessions, and group II consisted of 97 accessions. When K = 3, the UPGMA dendrogram and PCA results were still consistent. Group I had 8 accessions (‘Zhe 21’, ‘Zhecaishu 726’, ‘Hongpibaixin-14’, ‘Zhezishu 1’, ‘Xushu 18-2’, ‘Zhezishu 6’, ‘Zhe 13’, ‘Taizhong 6’), and 54 and 43 accessions were clustered in groups II and III, respectively. In the two-dimensional and three-dimensional PCAs, the sweet potato germplasms within each of the three groups were tightly concentrated, and the genetic differences among the three groups were obvious. The genetic distance between Group_1 and Group_2 was 0.265, that between Group_1 and Group_3 was 0.727, and that between Group_2 and Group_3 was 0.819 (Table 5). These data indicate that Group_2 and Group_3 were the most genetically distant, followed by Group_1 and Group_3, while Group_1 and Group_2 differed the least. The pairwise fixation index (FST), a measure of population differentiation determined by genetic structure, is often used to assess genome-wide variation. The mean FST value among the three groups was 0.001, indicating that there was a very high level of differentiation between the three groups. The genetic diversity among the three groups was very large, and germplasms from the three groups could be combined as hybrid parents.
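To make the ordering of these group-level distances concrete, the following minimal sketch applies average-linkage (UPGMA) clustering to the three pairwise values quoted from Table 5. It is an illustration only: the dendrogram in this study was built from accession-level marker data, and the function name `upgma` and the data layout are assumptions introduced here.

```python
# Pairwise genetic distances between the three groups, as reported in the text.
distances = {
    frozenset(["Group_1", "Group_2"]): 0.265,
    frozenset(["Group_1", "Group_3"]): 0.727,
    frozenset(["Group_2", "Group_3"]): 0.819,
}


def upgma(pairwise):
    """Average-linkage (UPGMA) clustering on a dict of pairwise distances.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of the original labels.
    """
    labels = sorted({name for pair in pairwise for name in pair})
    clusters = [(name,) for name in labels]

    def average_distance(a, b):
        # UPGMA distance: mean of all member-to-member distances.
        total = sum(pairwise[frozenset([x, y])] for x in a for y in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        # Pick the closest pair among the current clusters.
        dist, a, b = min(
            ((average_distance(a, b), a, b)
             for i, a in enumerate(clusters)
             for b in clusters[i + 1:]),
            key=lambda item: item[0],
        )
        merges.append((a, b, dist))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = upgma(distances)
for a, b, dist in merges:
    print(a, b, round(dist, 3))
```

Group_1 and Group_2 join first at 0.265, and Group_3 then joins that pair at the average of 0.727 and 0.819, that is, 0.773.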
AMOVA showed that variation among and within populations accounted for 53% and 47% of the total, respectively, indicating that the genetic variance was significant in Zhejiang sweet potato germplasm resources (Fig. 5, Table 4). Most of the sweet potato resources in this study have been grown in Zhejiang Province for a long time, and local environmental conditions have had a significant effect on genetic variation. This result was inconsistent with those of previous studies7,9. The PhiPT value, which measures the proportion of the total genetic variance attributable to differences among populations, was 0.526 (P ≤ 0.001), showing that the differentiation among populations was extremely significant.
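The relationship between the AMOVA percentages and PhiPT can be checked with a short calculation. The sketch below assumes the standard AMOVA definition, PhiPT = V_among / (V_among + V_within), which is not spelled out in this paper; the function name `phi_pt` is hypothetical.

```python
def phi_pt(var_among, var_within):
    """PhiPT under the standard AMOVA definition (assumed here):
    variance among populations divided by total variance."""
    total = var_among + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_among / total


# Percentages of variation reported in the text (among vs. within populations).
value = phi_pt(53.0, 47.0)
print(round(value, 2))
```

The rounded percentages give 0.53, which agrees with the reported PhiPT of 0.526 to two decimal places.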
We found that several germplasms with similar phenotypes were placed in closely related subgroups. For example, ‘Hongpibaixin-7’, ‘Shenglibaihao-1’, ‘Fanshu-2’, ‘Shenglibaihao-4’, ‘Shenglibaihao-2’, ‘Shenglibaihao-3’, and ‘Hongpibaixin-8’, which are similar in leaf shape, leaf teeth type, leaf color, stem primary color, root tuber skin color, and root tuber flesh color, clustered together in the second subgroup of the second group. The germplasms ‘Hongpibaixin-4’, ‘Hongpibaixin-2’, ‘Hongpibaixin-3’, ‘Hongpibaixin-10’, ‘An’yangbaifanshu’, ‘Liushiri-1’, ‘Liushiri-2’, ‘Hongpibaixin-5’ and ‘Hongpibaixin-11’, which are similar in leaf shape, leaf color, leaf teeth type, leaf vein color, leaf stalk color, stem primary color, root tuber skin color, root tuber flesh color, and root tuber shape, clustered in the fifth subgroup of the second group, while ‘Hongpibaixin-11’ and ‘Hongpibaixin-4’; ‘Jinguahuang’, ‘Nanguafanshu-2’, ‘Baimahongxin’, ‘Nanguafanshu-1’, ‘Chun’anhongxin’, and ‘Jinguahuangfanshu’ were similar in leaf vein color, stem primary color, root tuber skin color, root tuber flesh color, and root tuber shape, and five of the germplasms were clustered together in the fourth subgroup of the second group. This phenomenon may be attributed to most of the germplasms having been collected from local growers who had planted them for many years. Long-term retention of planting material by growers may cause a germplasm resource to show segregation of traits. The UPGMA genetic relationship reflects the difference in genetic background between germplasm resources, so selecting genetically distant accessions as hybrid parents in breeding is more likely to generate elite varieties. Our results demonstrate the high potential of molecular marker-based parental selection for promoting genetic improvement in future sweet potato breeding programs.
However, several germplasm resources, such as ‘Shenglibaihao-1’, ‘Shenglibaihao-2’, ‘Shenglibaihao-3’, and ‘Shenglibaihao-4’, which were collected from different counties and cities of Zhejiang Province and given the same name (Shenglibaihao) by local people, differed in leaf vein color and leaf stalk color. These germplasm resources were therefore numbered and treated as synonymous. In the UPGMA dendrogram, however, these synonymous germplasm resources did not cluster together, indicating that they were not the same variety. ‘Shenglibaihao’, also named Okinawa 100, was bred in Japan and introduced to China before the 1970s. In the 1960s, filial generations of ‘Shenglibaihao’ accounted for almost 90% of the genetic background of improved varieties37,38,7,10. Filial generations with phenotypic traits similar to those of ‘Shenglibaihao’ may also have been called ‘Shenglibaihao’ by farmer breeders, which could explain why the resources named ‘Shenglibaihao-1’, ‘Shenglibaihao-2’, ‘Shenglibaihao-3’, and ‘Shenglibaihao-4’ showed different variations but clustered together. The synonymous landraces ‘Hongpibaixin’, ‘Liushiri’, and ‘Fanshu’ showed similar situations. The landraces ‘Hongpibaixin-2’ and ‘Hongpibaixin-3’, collected from Cangnan County, Wenzhou City, and Jinyun County, Lishui City, Zhejiang Province, China, respectively, showed 100% similarity in the UPGMA dendrogram, STRUCTURE, and PCA results. The 6 RBIP primer pairs used in this study amplified the same fragments in these two accessions (Supplementary Table 2), so we speculated that these two accessions might be synonyms.
To confirm this speculation, we investigated their phenotypic traits, including leaf shape, leaf tooth type, top bud color, tip hair color, leaf vein color, petiole color, stem primary color, stem secondary color, root tuber shape, root tuber skin color, and root tuber flesh color (Supplementary Table 3), and found that they indeed had the same phenotypic characters. The same was observed for ‘Jinguahuang’ and ‘Nanguafanshu-2’ and for ‘Zheshu 77’ and ‘Lianhuaru’, indicating that each of these pairs represents the same germplasm. However, we did not observe the same phenotypic characteristics between ‘Hongmudan’ and ‘Hongtou’, between ‘Chaosheng 5’ and ‘Jinqing’, or between ‘Zhe 259’ and ‘Shiniuhongmudan’. The reason may be that these sweet potato germplasm resources had very similar genetic backgrounds, and more markers will be needed to confirm their relationships.
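The first step of this synonym screening, flagging accessions whose marker profiles are identical, can be sketched as follows. The accession names and the six-marker profiles are invented for illustration and do not come from Supplementary Table 2; the function name `candidate_duplicates` is likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical 0/1 scores at six polymorphic markers; the profiles are invented.
profiles = {
    "accession_A": (1, 0, 1, 1, 0, 1),
    "accession_B": (1, 0, 1, 1, 0, 1),
    "accession_C": (0, 1, 1, 0, 0, 1),
    "accession_D": (1, 1, 0, 1, 0, 0),
}


def candidate_duplicates(profile_table):
    """Group accessions that share an identical marker profile."""
    groups = defaultdict(list)
    for name, profile in profile_table.items():
        groups[tuple(profile)].append(name)
    # Keep only profiles shared by two or more accessions.
    return sorted(sorted(names) for names in groups.values() if len(names) > 1)


dups = candidate_duplicates(profiles)
print(dups)
```

Identical profiles at a small number of markers only flag candidates; as the text notes, phenotypic comparison and additional markers are needed before two accessions are treated as the same germplasm.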