3.1. Phenotypic data analysis
After quality control of the phenotype data of the F2 generation group with the Liao method[27], a total of 380 rabbits in the F2 generation (205 male rabbits and 175 female rabbits) were obtained for genotyping and data analysis. Descriptive statistics were conducted for the phenotype values and heterosis of 380 valid samples at seven growth stages. The statistics included the number of valid samples (N), maximum value (Max), minimum value (Min), mean value (M), and standard deviation (SD). The average weights at the age of 35, 42, 49, 56, 63, 70, and 84 days were 786.31 g, 1011.42 g, 1243.56 g, 1473.96 g, 1706.34 g, 1943.69 g, and 2136.16 g, respectively (Table 1). The average heterosis for each group were − 0.006 g, -0.012 g, -0.119 g, -0.055 g, 0.016 g, -0.011 g, and − 0.017 g(Table 2). In addition, the average daily weight gains from 35 to 42, 42 to 49, 49 to 56, 56 to 63, 63 to 70, and 70 to 84 days were 31.91 ± 10.63g, 32.99 ± 9.21 g, 32.91 ± 9.35 g, 32.91 ± 9.35 g, 33.02 ± 9.56 g, and 12.06 ± 23.95 g, respectively (Table 3). The average heterosis were 0.001 ± 0.39 g, -0.37 ± 0.28 g, -0.39 ± 4.23 g, 0.78 ± 2.16 g, 0.14 ± 0.59 g, and − 0.13 ± 2.32 g (Table 3). As shown in Fig. 1, the weights of the 380 valid samples in the F2 generation followed normal distribution at all seven growth stages.
3.2. Genotyping and population stratification
The number of SNPs on each chromosome changed significantly before and after QC, as shown in Fig. 2A. After screening and imputation of genotype data according to QC standards, a total of 78,579 SNPs were obtained on the autosomes of 380 samples (205 male rabbits and 175 female rabbits). The distribution of SNPs on the 21 autosomes is shown in Fig. 2B. The effective number (N), average distance (AD), and minor allele frequency (MAF) of SNPs on each autosome were also counted (Table 4).
The presence of population stratification may lead to changes in allele frequencies, resulting in subpopulations. This change may affect the accuracy of the GWAS, leading to false positive results[10]. Based on the effective SNPs obtained after quality control, principal component analysis was performed using PLINK v1.9[28] (Fig. 2C), and the results were analyzed using PCA for R v. 4.2.1 (https://www.r-project.org/). The results indicate no apparent stratification among samples.
3.3. Correlation analysis of growth traits
The experiment used the MLM model to perform a GWAS of phenotypes and quality-controlled 78,579 SNPs in 380 samples. A total of 10 SNPs reaching a significant level of FDR (P < 0.1) were detected. These significant SNPs were all significantly associated with body weight at 84 days and were distributed on chromosomes 1, 5, 7, and 12. Annotation within 1 Mb upstream and downstream of all significant SNPs identified a total of 99 candidate genes (Supplementary Table 1), including 85 protein-coding genes and 14 long non-coding RNAs. Through the localization of candidate genes (Fig. 3), ABTB2, CDH11, CNTNAP5, PPP1R3A, JARID2, GPX5, and GPX6 were discovered. Ultimately, we believe that these 7 genes may be candidate genes affecting the heterosis of meat rabbit growth traits.
Functional analysis of candidate genes found no relevant functional descriptions for the 14 long non-coding RNAs. Among the 85 protein-coding genes, 21 are novel genes (Supplementary Table 1). Enrichment analysis identified 22 GO entries related to biological processes and 6 KEGG pathways (Supplementary Table 2、Supplementary Table 3). Among them, 18 GO entries and 5 KEGG pathways reached significant levels (corrected p < 0.05 was considered significantly enriched). Both candidate genes CAPRIN1 and JARID2 were involved in the most significant and highly enriched GO entry: gene expression. In addition, candidate gene JARID2 is involved in 9 biological processes, such as cellular macromolecule biosynthetic process and gene expression process.
3.4. ROH and correlation of inbreeding coefficients
In the F2 generation, a total of 42018 ROHs were detected from 380 samples. The quantity of ROH is highly correlated with its length, ranging from 0.9757 to 0.9837 (Fig. 4A). Based on the length of ROH, we classified ROH into 3 categories: 0.3 ~ 2 Mb, 2 ~ 4 Mb, and > 4 Mb[35, 36]. The shorter ROH segments (0.3 ~ 2 Mb) account for a high proportion (Fig. 4B), with an average length of 0.93Mb. The ROH on each chromosome were visualized (Fig. 4C and 4D) and the number, average length, and inbreeding coefficient of the ROH were calculated (Table 5).
We used the Logistic, Gompertz, Brody, and Von Bertalanffy four meat rabbit fitting curve models established by Liao[27], all of which successfully fit the growth curve of the F2 generation sample. It was found that the Logistic model had the best fitting effect, with a high model fitness of 0.996. Therefore, in this experiment, we fitted the growth curve of the population based on the Logistic model. The fitting results of the growth curve indicate that the inflection point of growth is 49 days old. Hence, the analysis of ROH and FROH at 49 days old shows that the range of correlation with heterosis is -0.1267 to 0.07436 for ROH and 0.8754 to 0.9150 for ROH and FROH.
3.5. Functional annotation of candidate genes in ROH.
Perform frequency statistics on SNPs of ROH, the top 5% high-frequency SNPs were defined as high-frequency SNPs, and the ROH regions formed by them were defined as high-frequency ROH regions (a total of 1924 were detected). Annotation of SNPs in high-frequency regions identified 1018 candidate genes, including 748 protein-coding genes and 270 long non-coding RNAs (Supplementary Table 4). By annotating the genes in the high frequency region and using the whole genome SNPs marker information, 20 genes related to economic traits were identified. (Table 6). GO functional and KEGG signaling pathway enrichment analysis was performed on the selected candidate genes. The results indicated 28 GO entries, including 9 biological processes, 8 cellular components, and 11 molecular functions closely related, as shown in Fig. 5A. The most important biological processes were catalytic activity, magnesium ion binding, and ubiquitin-protein transferase activity, particularly those related to RNA binding. Additionally, eight significant KEGG signaling pathways are shown in Fig. 5B. MAP2K6 and MEP1B are involved in the signaling pathways of Salmonella infection and protein digestion and absorption, respectively(Supplementary Table 5).