Cultivated peanut (Arachis hypogaea L.; 2n = 4x = 40) is an important oil and cash crop that is grown worldwide, with an annual global production of over 48 million tons (FAOSTAT, http://www.fao.org/faostat/en/#data/QC, 2019). Because the kernels develop inside a pod, peanuts must be shelled before they can be used (e.g., for oil extraction, food processing, and sowing). Shelling refers to the process of breaking the peanut shell and separating it from the kernel [1, 2].
Pod shell thickness (PST), which is calculated as the distance between the exocarp and the mesocarp of peanut, is an important peanut shell trait that affects peanut processing and resistance to pest infestations [3]. Thick-shelled peanuts are suitable for storage, but breaking and removing their shells can be difficult (i.e., decreased shelling efficiency), whereas thin-shelled peanuts are appropriate for shelling, but they are susceptible to pest infestations and their seeds may be damaged during shelling [4, 5]. Therefore, optimizing PST is conducive to maximizing the mechanized shelling efficiency and peanut quality, while also improving peanut storage and processing, potentially leading to increases in the economic value of peanuts.
A previous study revealed that PST is a complex quantitative trait with a normal distribution [3]. Pod shell thickness is typically determined by measuring the thickness of the pod waist [6], the ridge of the pod posterior chamber [7, 8], the pod stalk [9], and the most convex part of the posterior chamber [3] using a digital vernier caliper. Guo et al. were the first to use vernier calipers to measure the mature pod waist and calculate the shell thickness. Moreover, they used 204 chromosome segment substitution lines obtained from a cross between the recurrent parent Qinhuangdaoguangyang and the donor parent Shiyaodou to identify quantitative trait loci (QTLs) for PST, among which two QTLs were detected on chromosomes 2 and 15, with phenotypic variance explained (PVE) values of 7.65% and 9.00%, respectively [6]. Li et al. used a recombinant inbred line (RIL) population comprising 151 lines derived from 79266 and D893 to identify QTLs for PST; 14 QTLs were detected in seven environments (PVE value of 6.90–23.16%), while qPST12 on chromosome 12 was detected in three environments (PVE value greater than 20%) [7]. Liu et al. analyzed a RIL population consisting of 441 lines derived from Shanhua15 and Zhonghua12 and identified four QTLs on chromosomes 7, 8, 13, and 17 (PVE value of 1.75–3.22%) [8]. Yang et al. examined the 267 lines of a RIL population derived from Yueyou92 and Xinhuixiaoli and detected three QTLs on chromosomes 4, 7, and 13 (PVE of 7.36–11.61%) [9]. In a recent study, one stable major QTL for PST (qAHPS07) was finely mapped to a 36.46-kb physical interval on chromosome 7 [10]. However, PST measurements are generally imprecise and the complex genetic mechanisms underlying this trait remain relatively uncharacterized.
Bulked segregant analysis (BSA), which involves the pooling of samples with extreme phenotypes, may be useful for cost-effective and highly efficient marker-based screening or QTL mapping [11]. During the last decade, advances in next-generation sequencing technologies and decreases in sequencing costs have enabled researchers to gradually develop and improve BSA sequencing (BSA-seq) methods and technical systems for analyses of multiple crops, including maize [12], soybean [13], rice [14], and wheat [15]. These methods have also been used in studies on various peanut traits, including testa color [16], branch number [17], sucrose content [18], and pod size [10].
Four kinds of statistical algorithms can be used to identify the single nucleotide polymorphisms (SNPs) or genomic regions associated with target traits by BSA-seq analysis. The SNP-index is a classical algorithm. Specifically, Δ(SNP-index) is the difference in the genotype frequency between the two mixed pools. A strong association between the genomic region and the target trait is reflected by a Δ(SNP-index) whose absolute value is close to 1 [19]. The Euclidean distance (ED) algorithm is used to identify the SNP sites that differ significantly between two mixed pools as well as to evaluate the regions associated with traits. Theoretically, in the two mixed pools constructed for a BSA, all loci except for those related to the target trait tend to be the same. Thus, the ED of non-target loci should be close to 0. The magnitude of ED represents the degree of the difference in the markers between the two mixed pools [20]. The G value is a modified statistic obtained after smoothing the G statistic, which is useful for mapping a relatively narrow region. The G value of each SNP is calculated on the basis of the allele sequencing depth, and is weighted according to the physical distance of the adjacent SNPs [21]. Fisher’s exact test is used to calculate a P value at each variant position, from which a P value-based plot along the genomic position is generated [22].
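To make the four statistics concrete, the following is a minimal Python sketch that computes them for a single bi-allelic SNP from the reference and alternative allele read counts in a high-value pool and a low-value pool. It is an illustration under stated assumptions, not the pipeline used in this study: the function names and the read counts are hypothetical, the ED is written for two alleles only, the smoothing step that converts the per-SNP G statistic into the G value is omitted, and Fisher's exact test is implemented directly from the hypergeometric distribution using only the standard library.

```python
import math


def delta_snp_index(alt_hi, depth_hi, alt_lo, depth_lo):
    """Difference in SNP-index (alt reads / total depth) between the pools."""
    return alt_hi / depth_hi - alt_lo / depth_lo


def euclidean_distance(ref_hi, alt_hi, ref_lo, alt_lo):
    """ED between the allele-frequency vectors of the two pools (bi-allelic)."""
    n_hi = ref_hi + alt_hi
    n_lo = ref_lo + alt_lo
    return math.sqrt((ref_hi / n_hi - ref_lo / n_lo) ** 2
                     + (alt_hi / n_hi - alt_lo / n_lo) ** 2)


def g_statistic(ref_hi, alt_hi, ref_lo, alt_lo):
    """Unsmoothed per-SNP G statistic for the 2 x 2 table of read counts."""
    observed = [[ref_hi, alt_hi], [ref_lo, alt_lo]]
    row_totals = [sum(row) for row in observed]
    col_totals = [ref_hi + ref_lo, alt_hi + alt_lo]
    total = sum(row_totals)
    g = 0.0
    for i in range(2):
        for j in range(2):
            obs = observed[i][j]
            if obs > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g += obs * math.log(obs / expected)
    return 2.0 * g


def fisher_exact_two_sided(ref_hi, alt_hi, ref_lo, alt_lo):
    """Two-sided Fisher's exact P value for the same 2 x 2 table."""
    row_hi = ref_hi + alt_hi
    row_lo = ref_lo + alt_lo
    col_ref = ref_hi + ref_lo
    denom = math.comb(row_hi + row_lo, col_ref)

    def prob(x):
        # Hypergeometric probability of x reference reads in the high pool.
        return math.comb(row_hi, x) * math.comb(row_lo, col_ref - x) / denom

    p_obs = prob(ref_hi)
    lowest = max(0, col_ref - row_lo)
    highest = min(row_hi, col_ref)
    # Sum all tables that are no more probable than the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical read counts at one SNP (not data from this study).
high_pool = {"ref": 2, "alt": 18}
low_pool = {"ref": 17, "alt": 3}
args = (high_pool["ref"], high_pool["alt"], low_pool["ref"], low_pool["alt"])
print("delta SNP-index:", delta_snp_index(18, 20, 3, 20))
print("ED:", euclidean_distance(*args))
print("G:", g_statistic(*args))
print("Fisher P:", fisher_exact_two_sided(*args))
```

For a bi-allelic site, the ED defined this way equals √2 times the absolute Δ(SNP-index), so the two statistics rank SNPs identically when the pools have no missing reads. In practice, all four statistics are usually averaged or smoothed over sliding windows before candidate regions are called.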
Kompetitive allele-specific PCR (KASP) is a fluorescence-based genotyping technology that enables the accurate detection of bi-allelic SNPs and insertion–deletion (InDel) polymorphisms at specific loci in complex genomic DNA samples [23]. It is potentially very useful for improving crop traits because of its high throughput and ease of operation [24]. In addition, BSA-seq may be combined with KASP marker-based fine mapping for the rapid and efficient identification of QTLs linked to target traits. This combined approach has been widely applied to identify QTLs for agronomic traits in diverse crops, including plant height in rice [25], powdery mildew resistance in melon [26], kernel length in wheat [27], and sucrose content in peanut [18].
In this study, a new method for calculating PST was proposed and then BSA-seq and fine mapping were used to identify QTLs for PST in an F2 population obtained from a cross between Yueyou 18 (YY18) and Weihua 8 (WH8). The results of this study may be used to further clarify the genetic mechanism underlying PST, while also providing the theoretical basis for developing relevant molecular markers and cloning important genes.