Genome-wide detection of CNVs and CNVRs
Sequencing on an Illumina HiSeq 4000 platform produced high-quality NGS data for 32 fine-wool sheep (Additional file 1: Table S1). The reads were aligned to the sheep reference genome (Additional file 2: Table S2), and the coverage depth per individual ranged from 28.08× (M373370) to 40.21× (M373981). This depth was sufficient for CNV detection.
CNVs were called with CNVnator software, which is based on the read depth method. A total of 1,747,604 CNV events (49,851 “duplication” events and 1,697,753 “deletion” events) were detected in the 32 fine-wool sheep, an average of 54,612.63 CNVs per genome (Table 1, Additional file 3: Table S3). To explore the CNV distribution pattern in the four groups of fine-wool sheep, violin plots of CNV length were drawn. CNV lengths differed only slightly between the groups, but the total sum of CNVs varied widely among sheep within the CMS_horn population (Fig 1). The identified CNVs ranged from 0.20 kb to 5,023.60 kb in length, with an average length of 4.30 kb. By length, 69.44% of the CNVs fell within the 0–2 kb interval, 19.49% fell within 2–4 kb, and 11.07% were longer than 4 kb (Fig 2A).
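The length distribution reported above can be reproduced from a list of CNV lengths with a simple binning step. The following is a minimal sketch, not the authors' script: the function name, the toy lengths, and the treatment of the bin boundaries (a CNV of exactly 2 kb or 4 kb is counted in the lower bin) are assumptions made for illustration.

```python
def length_distribution(lengths_kb):
    """Return the percentage of CNVs in the 0-2 kb, 2-4 kb and >4 kb bins.

    Assumption: boundary values (exactly 2 kb or 4 kb) go to the lower bin.
    """
    bins = {"0-2 kb": 0, "2-4 kb": 0, ">4 kb": 0}
    for length in lengths_kb:
        if length <= 2:
            bins["0-2 kb"] += 1
        elif length <= 4:
            bins["2-4 kb"] += 1
        else:
            bins[">4 kb"] += 1
    total = len(lengths_kb)
    return {name: round(100 * count / total, 2) for name, count in bins.items()}


# Toy lengths in kb (hypothetical values, not data from this study)
toy_lengths_kb = [0.2, 0.9, 1.5, 1.8, 2.6, 3.9, 4.3, 12.0]
print(length_distribution(toy_lengths_kb))
```

The same function applies unchanged to CNVR lengths.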
After overlapping CNVs were merged, a total of 7,228 CNVRs were obtained, with AMS_no possessing 5,233, AMS_horn possessing 5,297, CMS_horn possessing 5,394, and AHS_no possessing 5,441 (Additional file 4: Table S4, Table 1). A total of 3,783 CNVRs were shared by the AMS_no, AMS_horn, CMS_horn and AHS_no sheep (Additional file 5: Fig S1). The 7,228 CNVRs comprised 6,345 “deletion” events, 861 “duplication” events and 22 “both” events, with an average length of 2.62 kb. The number of CNVRs per chromosome showed a significant positive linear relationship with chromosome length (R² = 0.87, Additional file 4: Table S4, Fig 3). In addition, the CNVRs were nonuniformly distributed across the sheep chromosomes, with the maximum length found on Ovis aries chromosome 1 (OAR1) and the minimum on OAR26 (Additional file 6: Fig S2). By length, 67.35% of the CNVRs fell within the 0–2 kb interval, 18.34% fell within 2–4 kb, and 14.31% were longer than 4 kb (Fig 2B).
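Merging overlapping CNVs into CNVRs amounts to a sort-and-sweep over intervals on each chromosome. The sketch below illustrates the idea under stated assumptions: the tuple layout, the function names, and the rule that any overlap of at least 1 bp triggers a merge are hypothetical, since the merging criterion is not specified in this section. A region that absorbs both deletion and duplication calls is labeled “both”.

```python
def _label(types):
    """Label a merged region by the CNV types it absorbed."""
    return "both" if len(types) > 1 else next(iter(types))


def merge_cnvs(cnvs):
    """Merge overlapping CNVs given as (chrom, start, end, type) into CNVRs.

    Assumption: closed intervals, and any overlap of >= 1 bp is merged.
    """
    regions = []
    # Sorting the tuples orders calls by chromosome, then by start position
    for chrom, start, end, cnv_type in sorted(cnvs):
        last = regions[-1] if regions else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Overlaps the open region: extend it and record the type
            last["end"] = max(last["end"], end)
            last["types"].add(cnv_type)
        else:
            regions.append(
                {"chrom": chrom, "start": start, "end": end, "types": {cnv_type}}
            )
    return [(r["chrom"], r["start"], r["end"], _label(r["types"])) for r in regions]


# Toy calls (hypothetical coordinates, not data from this study)
toy_cnvs = [
    ("OAR1", 100, 500, "deletion"),
    ("OAR1", 400, 900, "deletion"),
    ("OAR1", 2000, 2500, "duplication"),
    ("OAR1", 2400, 3000, "deletion"),
    ("OAR2", 100, 300, "duplication"),
]
for cnvr in merge_cnvs(toy_cnvs):
    print(cnvr)
```

With the toy input, the five calls collapse into three regions: one deletion, one “both” region and one duplication.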
Comparison with other studies on CNVs in sheep
The results of this study were compared with six previous reports on sheep CNVRs (Table 2). The previous studies detected between 111 and 3,488 CNVRs in sheep, with reported CNVR lengths of 10.56–120.53 Mb. Between 17 and 424 of the CNVRs detected in this study overlapped with previously reported CNVRs, corresponding to overlap ratios of 4.39%–55.46%.
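A comparison of this kind reduces to counting the regions in one set that overlap at least one region in another set. The following is a minimal sketch, assuming closed intervals and a 1 bp overlap criterion. The function name and toy coordinates are hypothetical, and the choice of which set serves as the denominator of the ratio is also an assumption, as it is not stated in the text.

```python
def overlap_summary(query, reference):
    """Count query regions overlapping >= 1 reference region.

    Regions are (chrom, start, end) tuples. Returns (hits, percent of query).
    Assumption: the ratio is expressed relative to the query set.
    """
    ref_by_chrom = {}
    for chrom, start, end in reference:
        ref_by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in query:
        candidates = ref_by_chrom.get(chrom, [])
        # Two closed intervals overlap when each starts before the other ends
        if any(start <= r_end and r_start <= end for r_start, r_end in candidates):
            hits += 1
    return hits, round(100 * hits / len(query), 2)


# Toy regions (hypothetical coordinates, not data from this study)
this_study = [("OAR1", 100, 200), ("OAR1", 500, 600), ("OAR2", 50, 80), ("OAR3", 10, 20)]
previous = [("OAR1", 150, 300), ("OAR2", 81, 100), ("OAR3", 1, 15)]
print(overlap_summary(this_study, previous))
```

The linear scan over each chromosome is adequate for a few thousand regions; an interval tree or a sorted sweep would be preferable for much larger sets.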
Functional annotation of the identified CNVRs
To further investigate the function of these CNVRs, functional enrichment analysis of the CNVR-harboring genes was performed. A total of 119 GO terms were enriched in the CNVRs shared by the four groups of fine-wool sheep (p < 0.05), comprising 48 biological processes, five cellular components and 66 molecular functions (Additional file 7: Table S5). These GO terms involved sensory perception systems (GO:0007605, GO:0050954 and GO:0007600), metabolic processes (GO:0006508, GO:0043112 and GO:0055070) and growth and development processes (GO:0048610, GO:0000003 and GO:0007423), among others. According to the KEGG pathway analysis, the shared CNVR-harboring genes were enriched in 18 pathways (p < 0.05, Additional file 8: Table S6), including the Jak-STAT signaling pathway (oas04630), the Rap1 signaling pathway (oas04015), the calcium signaling pathway (oas04020), the Hippo signaling pathway (oas04390), and the estrogen signaling pathway (oas04915). Functional enrichment analysis was also performed on the genes harbored by CNVRs specific to each of the four groups of fine-wool sheep. A large number of these genes participated in fat metabolism (GO:0006635, GO:0009062 and GO:0034440), amino acid metabolism (GO:0006658, GO:0006659 and GO:0005234), microelement metabolism (GO:0005506, GO:0010167 and GO:0006766), and response to stimuli (GO:0032102, GO:0032104 and GO:0009733), among other processes (Additional file 7: Table S5, Additional file 8: Table S6).
QTLs overlapping with identified CNVRs
The CNVRs detected in the four groups of fine-wool sheep were compared with a database of previously reported sheep QTLs to further analyze their hereditary effects. A total of 1,855 CNVRs were associated with 166 QTLs, with the QTL frequency ranging from 1 to 500. These included milk-, carcass- and health-related QTLs, among others, and provide important information for the future improvement of fine-wool sheep (Additional file 9: Table S7).
Population genetics of CNVRs
The 32 fine-wool sheep were divided into horned and polled groups, and selective sweep analysis of all the CNVRs was performed. As shown in Fig 4 and Table S8 (Additional file 10), the horned and polled fine-wool sheep showed genetic differentiation on many chromosomes, with the most significant variation on chromosome 10, in the RXFP2 and B3GLCT genes. Further analysis revealed that this locus contains three CNVs (10:29558601-29559800, 10:29592601-29593700, and 10:29603501-29605100), all of which are of the “deletion” type. The CNVRs with the top five VST values were selected as candidate CNVRs, and functional enrichment analysis of the genes annotated in these CNVRs was carried out. A total of 77 GO terms were enriched (Additional file 11: Table S9), and they were mainly associated with fat metabolism and responses to stress. In addition, seven KEGG pathways were enriched (Additional file 12: Table S10), including olfactory transduction, the Notch signaling pathway, and the renin-angiotensin system, among others.
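For readers unfamiliar with the statistic, VST is commonly defined as (VT − VS) / VT, where VT is the variance of copy number across all individuals and VS is the average within-group variance weighted by group size. The sketch below implements this common definition for two groups. It is an illustration only: the function name, the use of population variance, and the toy copy numbers are assumptions, and the exact computation used in this study is not given in this section.

```python
from statistics import pvariance


def vst(group_a, group_b):
    """Compute VST = (VT - VS) / VT for one CNVR and two groups.

    Inputs are per-individual copy numbers (or log2 ratios).
    Assumption: population variance is used for both VT and VS.
    """
    pooled = list(group_a) + list(group_b)
    v_total = pvariance(pooled)
    if v_total == 0:
        # No variation at all, hence no differentiation
        return 0.0
    # Within-group variance, weighted by group size
    v_within = (
        pvariance(group_a) * len(group_a) + pvariance(group_b) * len(group_b)
    ) / len(pooled)
    return (v_total - v_within) / v_total


# Toy copy numbers (hypothetical values, not data from this study)
horned = [2.0, 2.0, 2.0, 2.0]
polled = [1.0, 1.0, 1.0, 1.0]
print(vst(horned, polled))  # fully differentiated groups give a VST of 1
```

Values near 1 indicate that nearly all copy-number variance lies between the groups, while values near 0 indicate that the groups are indistinguishable at that CNVR.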
qPCR validation of CNVRs
To confirm the accuracy of our CNVR predictions, we randomly selected 10 CNVRs for validation by qPCR in 12 sheep samples. As shown in Fig S3 (Additional file 13), eight (80%) of the randomly selected CNVRs were in agreement with the CNVnator predictions.
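qPCR-based copy number is often estimated with the 2^(−ΔΔCt) method, in which the target locus is normalized to a reference gene and to a calibrator sample assumed to carry two copies. The sketch below illustrates that calculation and the resulting concordance rate. It is offered under explicit assumptions: this section does not state which quantification method was used, and the function names, the calling thresholds (1.5 and 2.5), and the Ct values are hypothetical.

```python
def copy_number(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Estimate copy number as 2 * 2^(-ddCt).

    Assumption: the calibrator sample carries two copies of the target.
    """
    delta_sample = ct_target - ct_ref              # dCt of the tested sheep
    delta_calibrator = ct_target_cal - ct_ref_cal  # dCt of the calibrator
    return 2 * 2 ** -(delta_sample - delta_calibrator)


def call_from_copy_number(cn, loss_below=1.5, gain_above=2.5):
    """Turn an estimated copy number into a call (thresholds are hypothetical)."""
    if cn < loss_below:
        return "deletion"
    if cn > gain_above:
        return "duplication"
    return "normal"


def concordance(qpcr_calls, predicted_calls):
    """Percentage of regions where the qPCR call matches the predicted call."""
    agree = sum(q == p for q, p in zip(qpcr_calls, predicted_calls))
    return 100 * agree / len(predicted_calls)


# Toy Ct values (hypothetical): the target amplifies one cycle later than
# in the calibrator, which corresponds to half the template, i.e. one copy.
cn = copy_number(ct_target=25.0, ct_ref=24.0, ct_target_cal=24.0, ct_ref_cal=24.0)
print(cn, call_from_copy_number(cn))
```

Under this scheme, agreement for eight of ten tested regions yields the 80% concordance reported above.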