In this study, NGS technology was used to detect CNVs in 32 indigenous fine-wool sheep in China. A total of 1747604 CNV events were detected, an average of 54612.63 CNVs per sheep. Compared with earlier CNV detection methods based on SNP chips and aCGH, NGS has many advantages for determining both the number and size of CNVs [7, 13]. With its high sensitivity for CNV detection, NGS can also identify CNV boundaries more accurately [14]. A total of 7228 CNVRs were obtained after merging overlapping CNVs, which greatly exceeded the numbers previously reported for sheep in SNP50 chip and SNP600 chip studies [11, 13, 15, 16]. This difference was not surprising, as the genomic coverage of SNP chips is poor, which results in the detection of longer CNVRs [17, 18]. The CNVRs detected in this study accounted for 2.17% of the sheep reference genome, which falls within the range (0.8%–5.12%) reported for horses, pigs, cattle and chickens [19–22]. However, CNVRs identified in some individual species have accounted for more than 10% of their reference genomes, which may be related to the different genetic backgrounds of the studied animals [23, 24]. Studies have shown that the number of CNVRs detected in populations consisting of a variety of species may be higher than the numbers detected in populations containing only a single species [18]. These results could also be ascribed to differences in the CNV calling algorithms and the standards used to determine the CNVs [25, 26]. Therefore, further development of bioinformatics algorithms and tools that generate highly reliable CNV calls is necessary to improve the quality of CNV studies. Among the CNVs identified in this study, “deletion” events were far more frequent than “duplication” events, consistent with the similar imbalance found in studies of other species [8, 27]. This may be because CNV calling algorithms are more sensitive to deletion events, as it is easier to identify a missing segment of the genome than an amplified one when there are limited numbers of sequence reads [20].
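The step of merging overlapping CNVs into CNVRs, and the share of the genome that the CNVRs cover, can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the toy intervals, the toy genome length and the half-open coordinate convention are all assumptions made for illustration, and the sketch also merges intervals that merely touch.

```python
from collections import defaultdict


def merge_cnvs(cnvs):
    """Merge overlapping (chrom, start, end) CNV calls into CNVRs.

    Coordinates are assumed to be half-open [start, end); intervals that
    overlap or touch on the same chromosome are merged into one region.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in cnvs:
        by_chrom[chrom].append((start, end))

    cnvrs = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Extend the current region if this call reaches further.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        cnvrs.extend((chrom, s, e) for s, e in merged)
    return cnvrs


def genome_fraction(cnvrs, genome_length):
    """Fraction of the reference genome covered by the merged CNVRs."""
    return sum(end - start for _, start, end in cnvrs) / genome_length


# Hypothetical toy calls pooled across animals (not data from this study).
toy_cnvs = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 2000, 2500),
    ("chr2", 50, 150),
]
toy_cnvrs = merge_cnvs(toy_cnvs)
toy_fraction = genome_fraction(toy_cnvrs, genome_length=10_000)
```

With these toy calls, the two overlapping chr1 intervals collapse into a single region, leaving three CNVRs that cover 1400 of the 10000 toy base pairs.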
Keeping in mind that the detection rate of CNVRs is affected by many factors, the results of this study were compared with those of six previous studies on sheep CNVs. The CNVRs identified in these previous studies differed to some extent, which may have been related to differences in the sheep breeds, sample sizes, CNV detection platforms and CNV calling algorithms used. However, it is noteworthy that the CNVRs identified in this study had high overlapping ratios (27.93%–55.46%) with the CNVRs identified by Liu et al., Ma et al., Zhu et al. and Ma et al., but low overlapping ratios (4.39%–12.59%) with the CNVRs detected by Fontanesi et al. and Jenkins et al. [7, 11, 13, 15, 16, 28]. The four studies with high overlapping ratios all used Chinese indigenous sheep breeds or Chinese cultivated sheep breeds as the study subjects, whereas the two studies with low overlapping ratios used foreign sheep breeds. It was also noted that, in comparisons with studies that used Illumina OvineSNP BeadChips to detect sheep CNVs, the number of CNVRs overlapping with those identified in this study tended to increase as the number of probes on the chip increased from SNP50 to SNP600. The choice of CNV calling algorithm also has a substantial effect on the results of CNVR studies. The software packages commonly used for CNV detection at present include PennCNV, CNVcaller and CNVnator. PennCNV has been extensively applied to Illumina chip data, especially high-density SNP data [16, 29]. CNVcaller and CNVnator use read depth methods to detect CNVs in resequencing data [30, 31]. Each software package has its own advantages and disadvantages, which may affect the accuracy of CNV detection.
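One simple way to express an overlapping ratio between two CNVR sets is sketched below in Python. The text does not state how the ratios were computed, so the definition used here (the fraction of CNVRs from one study that share at least 1 bp with any CNVR from another study) is an assumption, as are the function name, the half-open coordinates and the toy intervals.

```python
def overlap_ratio(query, reference):
    """Fraction of query CNVRs that share at least 1 bp with a reference CNVR.

    Both inputs are lists of (chrom, start, end) with half-open coordinates,
    so regions that only touch at a boundary are not counted as overlapping.
    """
    ref_by_chrom = {}
    for chrom, start, end in reference:
        ref_by_chrom.setdefault(chrom, []).append((start, end))

    hits = 0
    for chrom, start, end in query:
        candidates = ref_by_chrom.get(chrom, [])
        if any(start < r_end and r_start < end for r_start, r_end in candidates):
            hits += 1
    return hits / len(query) if query else 0.0


# Hypothetical toy CNVR sets (not data from this or any cited study).
this_study = [("chr1", 100, 900), ("chr1", 2000, 2500), ("chr2", 50, 150), ("chr3", 10, 20)]
other_study = [("chr1", 850, 1000), ("chr2", 150, 300), ("chr3", 0, 15)]
toy_ratio = overlap_ratio(this_study, other_study)
```

In this toy case two of the four query regions overlap a reference region, giving a ratio of 0.5. A stricter definition, for example one that requires a minimum reciprocal overlap, would give lower ratios, which is one reason why ratios from different studies are not directly comparable.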
In this study, many of the CNVR-harboring genes were significantly enriched for GO terms (GO:0007605, GO:0050954 and GO:0007600) relating to sensory perception. This is consistent with the results of studies on CNVs in humans, yak, pigs, horses, dogs and mice, which also found that GO terms relating to sensory perception were significantly enriched [32–37]. A previous study also found that, in comparison with cattle, gene families related to sensory perception were significantly enriched in yak [38]. Yak generally live in alpine pastoral areas that have serious shortages of fodder grasses in spring and winter, and a well-developed sensory perception system could improve their ability to acquire food. The three fine-wool sheep breeds used in this study are mainly farmed in extensive grazing systems, and their sensory perception-related gene families may therefore have expanded rapidly to adapt to fodder shortages and to the pressures of alpine and drought environments. Many GO terms (GO:0006508, GO:0006635 and GO:0016055) related to substance metabolism were also enriched, and these GO terms were also related to the environment in which the fine-wool sheep selected for this study live. Fine-wool sheep live in an extremely harsh environment, so substance metabolism mechanisms are of great importance for their production and reproduction. In addition, Wnt-related signaling pathways (GO:0030178 and GO:0016055) were enriched in some of the CNVR-harboring genes in the AMS_no group. Studies in humans and mice have shown that Wnt signaling plays a crucial role in hair follicle development and in hair growth during the transition from the resting period to the growth period [39, 40]. The three sheep breeds selected in this study are mainly used for wool production, and the wool quality of AMS is superior to that of CMS and AHS [41–43]. Therefore, the Wnt signaling pathway may make an important contribution to the hair follicle development process in AMS.
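Enrichment of a GO term among CNVR-harboring genes is commonly assessed with a one-sided hypergeometric test, which can be sketched in Python as follows. The text does not state which test or tool was used here, so this is an illustration under that assumption; the function name and all counts are hypothetical, and a real analysis would also correct for multiple testing across GO terms.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: number of annotated background genes
    K: background genes annotated with the GO term
    n: CNVR-harboring genes (the tested gene list)
    k: CNVR-harboring genes annotated with the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more annotated genes by chance.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical toy counts: 10 background genes, 4 carry the term,
# 3 genes lie in CNVRs and 2 of those carry the term.
toy_p = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
```

For the toy counts the p-value is 40/120, or one third, so such a small overlap would not be called significant. Exact integer arithmetic is used so that the sketch needs only the standard library.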
Through the analysis of KEGG signaling pathways, it was found that some of the CNVR-harboring genes were enriched for signaling pathways correlated with wool growth and development. It has been reported that, as one of the important pathways in the follicle development process, the Jak-STAT signaling pathway can stimulate MAPK to influence follicle development [44]. The skin is the largest non-genital organ targeted by estrogens, which can significantly change the cyclic response of the hair follicles. Estrogens can lengthen the hair growing period and shorten the rest period, thereby promoting rapid hair regeneration [45, 46]. In addition, some signaling pathways related to microelement and vitamin metabolism were also enriched. A shortage of microelements and vitamins can influence wool growth by influencing follicle development [47].
Many studies have shown that CNVRs contain QTLs associated with important economic traits in animals [48, 49]. Therefore, the CNVRs detected in this study were compared with the QTLs reported in the sheep QTL database. The QTL categories found in this study were largely the same as those found in pigs and cattle. The health-related QTLs found included fecal egg count QTLs, worm count QTLs and worm length QTLs. Previous studies have reported that worm infection rates in sheep can exceed 70% in many countries, causing huge losses to the livestock industry [50, 51]. Relative to barn-fed livestock, grazing livestock are more likely to be infected with worms. These results indicate that CNVs, which are a critical type of genetic variation, may have an important effect on sheep health.
To investigate the genetic role of CNVs in the domestication of horn type in fine-wool sheep, the 32 sheep were divided into horned and polled groups for a CNVR-based selective sweep analysis. The RXFP2 gene was found to be under strong selection between the two groups. Many previous studies have confirmed that RXFP2 is the main candidate gene related to sheep horn type [52–55]. Some genes associated with physical features in sheep are subject to directional artificial selection during the domestication process. CNVs may therefore accumulate in sheep populations under these selection pressures, thereby forming the genetic basis for important economic characteristics.
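One statistic that is widely used to measure how strongly a CNVR differs in copy number between two groups is Vst, and it is sketched below in Python for a horned versus polled comparison. The text does not name the statistic used in this study, so Vst is a swapped-in illustration rather than the authors' method; the function name, the use of population variance and the toy copy numbers are assumptions.

```python
from statistics import pvariance


def vst(group_a, group_b):
    """Vst = (V_total - V_within) / V_total for one CNVR.

    V_total is the variance of copy number across all animals, and
    V_within is the size-weighted mean of the two within-group variances.
    Values near 1 indicate strong differentiation between the groups.
    """
    all_values = list(group_a) + list(group_b)
    v_total = pvariance(all_values)
    if v_total == 0:
        # No copy number variation at all, so no differentiation.
        return 0.0
    n_a, n_b = len(group_a), len(group_b)
    v_within = (pvariance(group_a) * n_a + pvariance(group_b) * n_b) / (n_a + n_b)
    return (v_total - v_within) / v_total


# Hypothetical copy numbers at one CNVR (not data from this study).
horned = [2.0, 2.0, 2.0, 2.0]
polled = [4.0, 4.0, 4.0, 4.0]
toy_vst_fixed = vst(horned, polled)               # groups fully differentiated
toy_vst_mixed = vst([2.0, 4.0], [2.0, 4.0])       # identical groups
```

In the first toy case every horned animal has two copies and every polled animal has four, so Vst is 1.0; in the second case the groups have the same distribution and Vst is 0.0. Genome-wide, regions in the upper tail of such a statistic would be the candidates examined for genes such as RXFP2.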
Conclusions
In this study, the first resequencing-based CNV map of Chinese indigenous fine-wool sheep was developed, providing an important addition to the previously published sheep CNVs. This information will be beneficial for future investigations of the genomic structural variations underlying traits of interest in sheep.