Plant materials and phenotypic statistics
Two inbred wax gourd lines, GX-71 and MY-1, were cultivated; GX-71 fruits are long and cylindrical, with green skin and flesh, whereas MY-1 fruits are round, with white skin and flesh. In this study, GX-71 and MY-1 were used to develop a four-generation population (P1, P2, F1, and F2) for genetic analysis of fruit shape. F1 plants were obtained by crossing GX-71 (P1) with MY-1 (P2), and the F2 population was obtained by selfing F1 plants. The parents and F1 individuals were planted in the spring of 2020. To identify candidate genes for fruit shape, 276 and 6461 F2 individuals were planted in the spring and autumn of 2020, respectively. Plants were spaced 0.5 m apart within rows, with 1.2 m between rows. To ensure full fruit development, only one well-developed fruit was retained per plant, between nodes 8 and 16. The longitudinal and transverse diameters of the fruits were recorded at 15 days post pollination (dpp). For the two parents, the longitudinal and transverse diameters of the fruits were also measured from ovary formation to fruit ripening. Three replicates were used for each measurement, and the fruit shape index (FSI) was calculated as the ratio of the longitudinal diameter to the transverse diameter. All plant materials were grown in a field at Guangxi University (Guangxi, China).
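The FSI calculation can be illustrated with a minimal Python sketch. The function names and the measurement values are hypothetical and are not data from this study; averaging the per-replicate ratios (rather than taking the ratio of averaged diameters) is an assumption, because the text does not state which was done.

```python
from statistics import mean

def fruit_shape_index(longitudinal_cm, transverse_cm):
    """FSI = longitudinal diameter / transverse diameter."""
    if transverse_cm <= 0:
        raise ValueError("transverse diameter must be positive")
    return longitudinal_cm / transverse_cm

def mean_fsi(replicates):
    """Mean FSI over replicate (longitudinal, transverse) pairs.

    Assumption: the ratio is computed per replicate and then averaged.
    """
    return mean(fruit_shape_index(l, t) for l, t in replicates)

# Hypothetical replicate measurements in cm (illustrative only)
long_fruit = [(30.0, 10.0), (33.0, 11.0), (27.0, 9.0)]
round_fruit = [(12.0, 12.0), (11.0, 11.0), (13.0, 13.0)]

print(mean_fsi(long_fruit))   # elongated fruit: FSI well above 1
print(mean_fsi(round_fruit))  # round fruit: FSI close to 1
```

An FSI near 1 corresponds to a round fruit, and larger values correspond to more elongated fruit.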
To compare the cytological characteristics of the fruits of the two parents, wax gourd flesh at 0 and 15 days post pollination was cut into thin slices and immediately placed in formalin-acetic acid-alcohol (FAA) fixative (50 % ethanol: 40 % formaldehyde: glacial acetic acid = 16:1:1). The volume of fixative was approximately four times that of the tissue slices. The bottle was sealed with sealing film, and the tissue was fixed at 25 °C for more than 48 h. The fixed tissue was dehydrated stepwise in an ethanol series (70 %, 80 %, 90 %, 95 %, and 100 %). The slices were then treated with xylene and embedded in paraffin. Vertical and horizontal sections were cut with a paraffin microtome and mounted on slides. The sections were stained using the safranin-fast green double staining method and observed with a Z2 automatic upright differential interference fluorescence microscope (Zeiss, Germany).
The cell sizes and cell numbers of the parental lines were estimated from the resulting images using ImageJ software (https://imagej.nih.gov/ij/).
Genomic DNA of the parents, F1 plants, and F2 individuals was extracted from leaf material using the cetyltrimethylammonium bromide (CTAB) method (Porebski et al. 1997). The concentration and purity of the extracted DNA were measured using a k5800 ultra-micro spectrophotometer (Kaiao, Beijing, China), and DNA integrity was evaluated by 1.2 % agarose gel electrophoresis.
After the samples passed quality control, the DNA was randomly sheared by ultrasonic fragmentation. The DNA fragments were end-repaired, an A was added at the 3′ end, and sequencing adapters were ligated; the fragments were then purified and amplified by PCR to construct a sequencing library. After passing quality inspection, the library was sequenced on the Illumina sequencing platform. The original image data files obtained through high-throughput sequencing were converted into sequence reads by base calling. The raw reads included adapter-containing and low-quality reads, so they were filtered to obtain clean reads and ensure the quality of subsequent analyses. The main data filtering steps were as follows: (1) adapter sequences were removed; (2) if the proportion of N in a read was more than 10 %, the read pair was filtered out; and (3) low-quality reads (those in which bases with Q ≤ 10 accounted for more than 50 % of the whole read) were removed. The clean reads were aligned to the reference genome using BWA software, and their positions on the reference genome were used for subsequent variant analysis (Li and Durbin 2009).
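Filtering criteria (2) and (3) above can be sketched in Python as follows. This is an illustrative sketch, not the pipeline used in the study: the function names are hypothetical, adapter removal (step 1) is not shown, and discarding the whole pair when either mate fails the quality criterion is an assumption (the text states pair-level removal only for the N criterion).

```python
def n_fraction(seq):
    """Fraction of bases in the read that are N (empty reads count as all N)."""
    return seq.upper().count("N") / len(seq) if seq else 1.0

def low_quality_fraction(quals, threshold=10):
    """Fraction of bases with Phred quality <= threshold."""
    return sum(q <= threshold for q in quals) / len(quals) if quals else 1.0

def keep_read(seq, quals, max_n=0.10, max_low_q=0.50):
    """A read passes if N content is not more than 10 % and
    bases with Q <= 10 are not more than 50 % of the read."""
    return n_fraction(seq) <= max_n and low_quality_fraction(quals) <= max_low_q

def keep_pair(read1, read2):
    """Each read is a (sequence, list of Phred scores) tuple.
    Assumption: the pair is kept only if both mates pass."""
    return keep_read(*read1) and keep_read(*read2)

# Hypothetical reads for illustration
good = ("ACGTACGTAC", [30] * 10)
n_rich = ("ACGTNNGTAC", [30] * 10)          # 20 % N
low_q = ("ACGTACGTAC", [5] * 6 + [30] * 4)  # 60 % of bases with Q <= 10

print(keep_pair(good, good), keep_pair(good, n_rich), keep_pair(good, low_q))
```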
BSA-seq mapping approach
For bulked segregant analysis, 60 F2 plants with extreme phenotypes (30 plants with long cylindrical fruits and 30 plants with round fruits) were selected from the 276 F2 plants. After single-plant resequencing, two mixed pools, one long cylindrical pool and one round pool, were constructed. The two mixed pools and the two parent pools were used for association analyses, with the GX-71 genome as the reference. SNP detection was performed using the GATK toolkit (McKenna et al. 2010). Based on the positions of the clean reads on the reference genome, GATK was used for local realignment and other preprocessing to ensure the accuracy of SNP detection, and then for SNP calling to determine the final SNP set. SNPs were filtered before the association analyses. The filtering criteria were as follows: first, SNPs with multiple genotypes were filtered out; second, SNPs with read support of less than 4 were filtered out; third, SNPs with consistent genotypes between the pools, and those for which the genotype of the recessive pool was not inherited from the recessive parent, were filtered out. The Euclidean distance (ED) algorithm was then used to analyze the association of SNPs with different genotypes between the two pools (Hill et al. 2013; Rym et al. 2013). In this analysis, the depth of each base at SNP sites with different genotypes between the two pools was determined in each pool and used to calculate the ED value of each site. To eliminate background noise, the original ED value was raised to the second power, the resulting value was taken as the association value, and the DISTANCE method was used to fit the ED values.
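The ED calculation at a single SNP site can be sketched in Python as follows. This sketch assumes the commonly used formulation in which ED is the square root of the summed squared differences in the frequencies of the four bases between the two pools; the text does not give the exact formula, so this is an assumption, and the function names and read depths are hypothetical. The fitting step (DISTANCE method) is not shown.

```python
import math

BASES = "ACGT"

def base_frequencies(depths):
    """Convert per-base read depths at one site into frequencies."""
    total = sum(depths.get(b, 0) for b in BASES)
    if total == 0:
        raise ValueError("no read support at this site")
    return {b: depths.get(b, 0) / total for b in BASES}

def euclidean_distance(pool_a, pool_b):
    """ED between two pools at one site (assumed formulation):
    sqrt of the summed squared base-frequency differences."""
    fa, fb = base_frequencies(pool_a), base_frequencies(pool_b)
    return math.sqrt(sum((fa[b] - fb[b]) ** 2 for b in BASES))

def association_value(pool_a, pool_b, power=2):
    """Raise ED to the given power (2 in the text) to suppress background noise."""
    return euclidean_distance(pool_a, pool_b) ** power

# Hypothetical read depths at one SNP site in the two pools
long_pool = {"A": 28, "G": 2}
round_pool = {"A": 3, "G": 27}
print(association_value(long_pool, round_pool))
```

Sites at which the two pools carry the same alleles at similar frequencies give values near 0, whereas sites linked to the trait approach the maximum value of 2.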
Based on the BSA-seq data and the distribution and density of the physical positions of the SNPs, KASP markers were developed at 1–3-Mb intervals in the candidate region. The reaction mixture was prepared and PCR amplification was performed according to the manufacturer’s instructions (LGC Genomics, Shanghai, China). The PCR reaction volume was 10 µL, comprising 4.78 µL of DNA (5–50 ng µL−1), 5 µL of KASP master mix (LGC Group, Teddington, Middx, UK), 0.14 µL of KASP assay mix, and 0.08 µL of Mg2+. PCR amplification was performed using touchdown PCR. The reaction conditions were as follows: heat treatment at 95 ℃ for 15 min; 10 touchdown cycles of denaturation at 95 ℃ for 20 s and annealing and extension from 65 to 55 ℃ for 25 s (decreasing by 1.0 ℃ per cycle); 30 cycles of denaturation at 95 ℃ for 10 s and annealing and extension at 57 ℃ for 1 min; followed by storage in the dark at 4 ℃. After amplification, fluorescence scanning and genotyping were performed. A total of 1469 F2 individuals were used for genotype-phenotype analyses.
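The thermal profile described above can be written out as an explicit list of steps. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes that the first touchdown cycle anneals at 65 ℃, so that ten cycles with a 1.0 ℃ decrement run at 65, 64, ..., 56 ℃.

```python
def touchdown_annealing_temps(start_c=65.0, step_c=1.0, cycles=10):
    """Annealing/extension temperature for each touchdown cycle.
    Assumption: the first cycle runs at start_c."""
    return [start_c - step_c * i for i in range(cycles)]

def kasp_program():
    """Return the program as (step name, temperature in deg C, seconds) tuples."""
    steps = [("heat treatment", 95.0, 15 * 60)]
    for temp in touchdown_annealing_temps():
        steps.append(("denaturation", 95.0, 20))
        steps.append(("annealing/extension", temp, 25))
    for _ in range(30):
        steps.append(("denaturation", 95.0, 10))
        steps.append(("annealing/extension", 57.0, 60))
    return steps

program = kasp_program()
total_minutes = sum(seconds for _, _, seconds in program) / 60
print(touchdown_annealing_temps())
print(len(program), "steps,", round(total_minutes, 1), "min excluding ramping")
```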
To further narrow the mapping interval, we used the flanking markers to genotype an F2 population of 4992 individuals and identify recombinants. New KASP markers were then developed between the flanking markers to genotype the recombinant plants, and the most likely target gene region was inferred by joint analysis of genotypes and phenotypes.
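The recombinant screen can be sketched in Python as follows. This is a simplified sketch under stated assumptions: genotypes are coded "A" (homozygous for the P1 allele), "B" (homozygous for the P2 allele), and "H" (heterozygous); a plant is treated as recombinant when its calls at the two flanking markers differ; and the plant IDs and calls are hypothetical.

```python
def find_recombinants(calls):
    """Return IDs of plants whose genotypes differ between the two flanking markers.

    calls maps plant ID -> (left marker call, right marker call);
    plants with a missing call (None) at either marker are skipped.
    """
    return sorted(
        plant_id
        for plant_id, (left, right) in calls.items()
        if left is not None and right is not None and left != right
    )

# Hypothetical KASP calls at the left and right flanking markers
f2_calls = {
    "F2-001": ("A", "A"),   # non-recombinant
    "F2-002": ("A", "H"),   # crossover between the markers
    "F2-003": ("H", "H"),   # non-recombinant
    "F2-004": ("H", "B"),   # crossover between the markers
    "F2-005": ("B", None),  # missing call, cannot be classified
}
print(find_recombinants(f2_calls))
```

Only the plants returned by such a screen need to be genotyped with the new markers located inside the interval.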
Cloning and sequencing analysis of candidate genes
The genomic sequences and full-length coding sequences (CDSs) of the candidate genes were cloned from gourd flesh. The primers (Table S1) were designed based on the genome. Total RNA was isolated using a plant RNA purification kit (Tiangen, Beijing, China), according to the manufacturer’s instructions, and treated with RNase-free DNase solution to remove residual genomic DNA. First-strand complementary DNA (cDNA) was synthesized using reverse transcriptase, and primers for homologous cloning were designed using the CDS database of the research group. The 2 × A8 FastHiFi PCR Master Mix (Aidlab, Beijing, China) was used for PCR amplification. The PCR product was checked by 1.2 % agarose gel electrophoresis, and the target band was excised from the gel, recovered, and purified. Then, a zero-background pTOPO-Blunt cloning kit (CV16) from Aidlab was used to construct an expression vector, according to the manufacturer’s instructions; 1 µL of pTOPO-Blunt vector, 1 µL of 10× Enhancer, 6 µL of sterile water, and 2 µL of gel-purified PCR product were mixed and ligated at 37 °C for 10 min. The vector was transformed into chemically competent E. coli DH5α cells according to the manufacturer’s instructions (Aidlab, Beijing, China), and colonies that were positive by colony PCR were selected for sequencing confirmation. All fragments were sequenced by Shanghai Shengong Biotechnology Co., Ltd. DNAMAN (Shanghai, China) was used for multiple sequence alignments.
Gene expression analysis
To analyze gene expression, RNA was extracted from the ovary and flesh at different developmental stages, and from peels, roots, stems, leaves, and female and male flowers at the flowering stage. cDNA was synthesized using RT Master Mix (RR036A) following the manufacturer’s instructions (TaKaRa, Beijing, China). The gene-specific primers (Table S1) for the candidate gene and the reference gene, Actin, were designed using NCBI online Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). A QuantStudio 6 Flex system (Thermo Fisher, MA, USA) was used to evaluate the expression levels of the target genes by qRT-PCR. A SYBR Green real-time PCR mixture was used for all reactions. Each 20-µL qRT-PCR reaction contained 2 µL of cDNA, 1 µL of forward primer (10 µM), 1 µL of reverse primer (10 µM), 10 µL of 2× SYBR Green real-time PCR mixture, and 6 µL of nuclease-free water. Reactions were preheated at 95 ℃ for 30 s, followed by 40 cycles of 95 ℃ for 5 s and 60 ℃ for 34 s. High-resolution melting was performed at 95 ℃ for 15 s, 60 ℃ for 1 min, and 95 ℃ for 15 s. At least three replicates were tested for each sample. The raw qRT-PCR data were obtained using QuantStudio 6 Flex software (Thermo Fisher, MA, USA), and relative expression was determined by the 2^−ΔΔCT method with Actin as the internal control.
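The 2^−ΔΔCT calculation can be sketched in Python as follows. The function name and the CT values are hypothetical and are not data from this study; the choice of calibrator sample is an assumption, because the text does not state which sample served as the calibrator.

```python
from statistics import mean

def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^-ddCT method.

    Each argument is a list of replicate CT values. dCT is the mean target CT
    minus the mean reference-gene CT; ddCT is the sample dCT minus the
    calibrator dCT.
    """
    dct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    dct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    ddct = dct_sample - dct_calibrator
    return 2 ** (-ddct)

# Hypothetical CT values from three technical replicates
fold_change = relative_expression(
    ct_target_sample=[22.0, 22.0, 22.0],      # candidate gene, test sample
    ct_ref_sample=[18.0, 18.0, 18.0],         # Actin, test sample
    ct_target_calibrator=[24.0, 24.0, 24.0],  # candidate gene, calibrator
    ct_ref_calibrator=[18.0, 18.0, 18.0],     # Actin, calibrator
)
print(fold_change)  # dCT = 4 vs 6, ddCT = -2, so a 4-fold higher expression
```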