Development of mapping population
For inheritance analysis and constructing the mapping populations, the pink testa peanut cultivar Yuanza 9102 (YZ9102) was used as the female parent to cross with the red testa peanut cultivars Zhonghua 12 (ZH12) and Zhanhong 2 (ZHH2), respectively (Table 1). Two populations (YZZH12 and YZZH2) were developed through single-seed-descent method. The F1, F2 and F3 plants of two populations were planted in the field from 2017 to 2019 at Laixi Experimental Station (36°48'47.41" N,120°30'22.54" E), Shandong Peanut Research Institute (SPRI), Shandong, China. The F4 and F5 plants were planted in the field from 2020 to 2021 at Jiyang Experimental Station of Shandong Academy of Agricultural Sciences (SAAS), Shandong, China(36°58'34.53" N,116°59'1.29" E).
Measurement of anthocyanin
The anthocyanin content of peanut testa was measured using improved method as previous reports (Mancinelli et al., 1991, Teng et al., 2005, Zhao et al., 2020), using at least three sets of more than eight seeds per sample. In brief, frozen peanut testa (approximately 50 mg) was ground in a 5 mL centrifuge tube using liquid nitrogen. Then, homogenized testa was extracted at 4°C by adding 700 μl acidic methanol (the volume ratio of methanol to HCl is 99:1). After overnight incubation, the homogenates were centrifuged for 1 min at 12 000 rpm for 10 min. The supernatant (approximately 600 μl) was collected and mixed with 1 mL trichloromethane and 400 μl distilled water, and centrifuged at 4℃ at 12000 rpm for 10 min. Then, the absorbance of the supernatant was measured in spectrophotometer (U-3000, HITACHI, Japan) at 530 and 657 nm, respectively. The relative anthocyanin content was calculated according to the absorbance with the formula of [A530 − (1/4 × A657)] and then normalized by sample weight.
Whole genome sequencing and BSA-seq analysis
The genomic DNA was extracted from the leaves using the DNA Extraction Kit (DP305) of TIANGEN Biotech (Beijing,China) according to the manufacturer's instructions. DNA quality was determined using the BioPhotometer plus spectrophotometer (Eppendorf AG, Hamburg, Germany) and 1% agarose gel electrophoresis. For BSA-seq, two DNA pools were constructed by mixing equal amounts of DNA from 30 red testa F2:4 individuals (Red-pool) and 30 pink testa F2:4 individuals (Pink-pool). Then, DNA of YZ9102 and ZH12, and two DNA pools were sequenced on BGISEQ-500 platform at the Beijing Genomics Institute (BGI).
After sequencing, clean reads were obtained by removing low-quality and short reads using Soapnuke program (Chen et al., 2018), and mapped on reference genome of cultivated peanut Tifrunner (https://peanutbase.org/peanut_genome) using BWA software with the SAM tools (Li & Durbin, 2009). Single-nucleotide polymorphism (SNP) and Insertion/Deletion(InDel) were called, and filtrated by removing heterozygous and missing SNPs and InDels in the pools and parental lines using GATK software (McKenna et al., 2010). The SNP-index represents the ratio of reads harboring SNPs among the entire number of reads (Abe et al., 2012). To identify candidate regions associated with the red testa trait, the ΔSNP-index of each locus was calculated by subtracting the SNP-index of the Pink-pool from that of the Red-pool according to previous method (Takagi et al., 2013). To confirm the results of ΔSNP-index, Euclidean Distance (ED) algorithm was further preformed to identify the SNPs and InDels associated with the red testa trait using the equation reported previously (Lei et al., 2020, Hill et al., 2013).
Marker development, genetic map construction and mapping
To validate the BSA-seq results and further narrow down the region, 21 InDels in the candidate region were selected according to the comparative genomic information among the parents. Twenty one primer pairs were designed to the flanking sequences of the tragetted SNPs using Primer Premier 5.0 (http://www.premierbiosoft.com/primerdesign/). The polymorphism of these InDels was confirmed through polyacrylamide gel electrophoresis as described previously (Zhao et al., 2017). The Indel markers showing polymorphism among the parents were further used to genotype the F2:4 population. The sequences of primers used for mapping are listed in Supplementary Table S1.
Genetic linkage map was constructed using JoinMap 5.0 software (https://www.kyazma.nl/index.php/JoinMap/). The recombinant ratio was converted into genetic distances (centimorgans, cM) through the function of Kosambi map. The linkage groups were calculated at a minimum LOD score of 5. MapChart 2.3 software was used for drawing the linkage maps (Voorrips, 2002). For QTLs analysis, inclusive composite interval mapping of additive (ICIM-ADD) was performed using software QTL IciMapping V4.1 (Meng et al., 2015). The LOD threshold was calculated by 1,000 permutations at P < 0.05, and the LOD score was set at 2.5 to determine the presence of a putative QTL associated with a target trait.
Prediction of candidate genes
The sequences of gene information in the candidate interval were obtained according to the cultivated peanut reference genome sequences (Version 1, https://peanutbase.org). The functions of candidate genes were annotated through Blastx program in databases of Nr (NCBI, http://www.ncbi.nlm.nih.gov), GO (https://www.geneontology.org/), KOG (http://www.ncbi.nlm.nih.gov/KOG), and KEGG (http://www.genome.jp/kegg). For sequence alignment and phylogenetic analysis of the anthocyanidin reductase (ANR) genes, the amino acid sequences of ANR proteins were download from NCBI. Multiple sequence alignments were performed by clustalw (https://www.genome.jp/tools-bin/clustalw) and BoxShade online program (https://embnet.vital-it.ch/software/BOX_form.html). The phylogenetic analysis was produced by MEGA7 (https://www.megasoftware.net/) with the neighbor-joining statistical method. The GenBank accession numbers of these proteins are provided in Supplementary Table S2.
Quantitative Real-Time PCR (qRT-PCR)
Based on Functional annotation of genes, 14 genes were selected for qRT-PCR analysis. After harvest from field, the seed coat was collected immediately before drying. The samples for qRT-PCR analysis are the same as that for BSA-seq, including two parents and RNA pools, and three biological replicates were prepared for every sample. Total RNA was extracted using Trizol Reagent kit (TaKaRa, Inc., Dalian, China) according to the instructions of manufacturer. The reverse transcriptions were performed with PrimeScript II 1st Strand cDNA Synthesis Kit (TaKaRa, Inc., Dalian, China). The primers for qRT-PCR were designed using Primer Premier 5.0 (http://www.premierbiosoft.com/primerdesign/) and listed in Supplementary Table S1. The qRT-PCR reactions were performed on ABI7500 Real Time System (USA) as previous studies (Wang et al., 2017). The relative expressional levels of genes were calculated by 2-△△CT method.