DOI: https://doi.org/10.21203/rs.3.rs-1692291/v1
Background Carcass backfat thickness (BFT), carcass lean percentage (CLP) and carcass fat percentage (CFP) are important to the commercial pig industry. Identification of the candidate SNPs and genes associated with these traits will be beneficial to pig breeding. We performed a GWAS based on SLAF-seq to analyze seven traits, including five BFTs, CLP, and CFP on 223 four-way intercross pigs.
Results A total of 1,552,377 SLAFs from the pig reference genome (Sscrofa 11.1_102) were predicted using RsaI and HaeIII enzyme combination. With an average sequencing depth of 11.94-fold, a total of 10,784,484 SNPs were obtained. After SNPs’ filtering, 227, 921 SNPs were identified to perform GWAS. Using the mixed linear model (MLM) model, a total of 21 SNP loci significantly related to these traits were identified on ten chromosomes. Interestingly, nine shared traits SNPs were detected. Among these, SSC7:29503670 for average backfat thickness (ABFT, P=3.26E-08) and SSC7:29486003 for ABFT (P=2.12E-07) exceed 1% and 10% genome-wide significance, respectively. The two loci were located in the intron of COL21A1, which was highlighted as a strong candidate gene for porcine BFT traits. On SSC9, three adjacent significant SNPs were detected within a region of 0.19Mb (between 87.44Mb and 87.63Mb). PGM2L1 was considered a candidate gene for the region. Finally, a total of 13 genes, including ZNF184, ZNF391, COL21A1, OR4P4L, OR4C13L, OR5D14, OR5D13L, OR1J4L, HMGA1, PGM2L1, FAM171A1, PLBD2 and KIAA121 associated with carcass backfat thickness, and/or carcass lean percentage, and/or carcass fat percentage were considered potential candidate genes for further study.
Conclusions The results of this GWAS will greatly advance our understanding of the genetic architecture of BFT, CLP, and CFP traits. The identified SNPs and candidate genes were beneficial to pig breeding.
In the study, carcass backfat thickness (BFT) and carcass lean percentage (CLP) and carcass fat percentage (CFP) are complex quantitative traits and the most important economic factors in the pig industry. The excessive deposition of fat in pigs leads to low feed efficiency and is not favored by consumers. It is well known that CLP is strongly negatively correlated with BFT [1, 2]. In the past few years, researchers have worked on increasing carcass lean percentage and decreasing backfat thickness by advanced molecular breeding methods to improve production efficiency. To date, 3,402 QTLs and 265 QTLs are associated with fatness traits and CLP, respectively, which have been accumulated in the pig QTL database (http://www.animalgenome.org/cgi-bin/QTLdb/index, Dec 27, 2021).
Domestic pigs show great phenotypic diversity, which is attributed to about 10000 years of natural and artificial selection [3]. At present, Chinese indigenous pigs have different phenotypes compared with Western commercial pigs. Western commercial breeds, such as Large White, Landrace, and Duroc with fast growth and a high lean percentage are widely distributed all over the world. Conversely, Chinese indigenous breeds, such as the Saba pig, are fat-type pig breeds without intensive manual selection. The native Saba pig is well known for its high prolificacy, superior meat quality, and strong resistance to the harsh environment, is widely distributed in Yunnan Province, China. However, the shared disadvantage of the indigenous breeds containing Saba pig is the excessive deposition of fat resulting in a slow growth rate and low feed conversion rate. In general, Chinese native pigs have a lean percentage of less than 45%, which are very different from a lean percentage greater than 60% in Western commercial pigs [4]. Using Chinese and Western pig breeds as parents, the hybrid offspring show different extreme phenotypes in fatness and more genetic variation.
Currently, remarkable advances in fatness-related traits have been made, and many related QTLs and genes have been reported [5–12]. However, identifying exact quantitative trait locus locations and new candidate genes is still a challenge. For complex traits, such as BFT, CLP and CFP, large-scale analysis is necessary to detect trait-associated SNPs. Genome-wide association study (GWAS) [13] represents a powerful approach to correlating SNPs and functional genes with quantitative traits, and has been widely applied in important economic traits of pigs, including carcass [14–19], meat quality [14, 15, 19, 20], growth [15, 21], immunity [22], and reproductive traits [23]. Specific-locus amplified fragment sequencing (SLAF-seq) is an efficient solution for large-scale genotyping. This technique has several distinguishing characteristics: deep sequencing, reduced sequencing costs, optimized marker efficiency, and applicability to large populations [24]. GWAS based on SLAF-seq is successfully used to identify SNPs associated with important economic traits in chickens, ducks, geese, and rabbits [25–30]. Genotyping and genetic structure analysis are also successfully applied in pigs using the SLAF-seq genotyping method [31, 32].
Here, we examined 223 four-way crossbreds with Landrace, Yorkshire, Duroc and Saba pigs as the hybrid parents (Saba pigs as the hybrid females) raised under the same environmental conditions for BFT, CLP and CFP traits. Subsequently, SLAF-seq was employed to perform GWAS and recover potential alleles controlling these traits. To our knowledge, this was the first SLAF-seq-based GWAS to identify SNP loci and candidate genes linked to porcine economic traits. The results provided a basis for the molecular marker (related to fatness traits)-assisted breeding and improvement of the fatness traits in pigs.
Table 1 summarized the statistical information on the seven traits. The mean values for SBFT, LRBFT, LBFT, ABFT, 67RBFT, CLP, and CFP were 4.4609cm, 2.8903cm, 2.8371cm, 3.399cm, 3.6014cm, 54.4451% and 27.1509%, respectively. The traits distribution basically conformed to the normal distribution (Fig. S1). The phenotypic correlation coefficients for the seven traits were shown in Fig. S2. Significantly positive correlations were found among five BFT traits (r > 0, p < 0.001), between five BFT traits and CFP (r > 0, p < 0.001). Five BFT traits and CFP were significantly negatively correlated with CLP (r < 0, p < 0.001). Furthermore, CLP showed the strongest negatively correlated with CFR (r=-0.86, p < 0.001), while ABFT showed the strongest positive correlations with SBFT, LRBFT and LBFT (r = 0.85, p < 0.001).
Traits | N1 | Min2 | Max3 | Mean | SD4 | CV5 |
---|---|---|---|---|---|---|
Backfat thickness at the shoulder, SBFT (cm) | 223 | 2.000 | 7.334 | 4.4609 | 0.9780 | 21.9442 |
Backfat thickness at last rib, LRBFT (cm) | 223 | 1.188 | 5.540 | 2.8903 | 0.7842 | 27.1320 |
Backfat thickness at last lumbar, LBFT (cm) | 223 | 0.974 | 6.232 | 2.8371 | 0.8176 | 28.8182 |
Average backfat thickness, ABFT (cm) | 223 | 1.827 | 6.248 | 3.3990 | 0.7334 | 21.5764 |
Backfat thickness at 6–7 ribs, 67RBFT (cm) | 223 | 0.986 | 5.652 | 3.6014 | 0.8214 | 22.8071 |
Carcass lean percentage, CLP (%) | 223 | 40.14 | 70.55 | 54.4451 | 4.4748 | 8.2189 |
Carcass fat percentage, CFP (%) | 223 | 10.97 | 39.37 | 27.1509 | 5.1876 | 19.1065 |
1Number of samples 2Minimum 3Maximum 4Stand deviation 5Coefficient of variation |
According to the selection principle of the enzyme digestion scheme, two restriction enzymes, RsaI and HaeIII, were selected as enzyme combinations for developing SLAF tags, and the sequence with the length of 314–344 bp was defined as SLAF tags. As shown in Table S1, a total of 1109.92 million reads were obtained from all individuals. Average Q30 and GC content were 90.74% and 44.83%, respectively. Similar to the number of expected SLAFs, a total of 1,552,377 SLAF tags (average 331,608 SLAFs) were identified across the whole genome, with sequencing to an 11.94 average depth. A total of 16,997 polymorphic SLAF tags were identified. In addition, Oryza sativa indica was used as a control during sequencing. The results showed that the percentage of digestion normally and paired-end mapped reads of control were 90.77% and 95.4%, respectively (Table S2), indicating that the SLAF-seq process was normal and available. The density distribution of SLAFs was calculated throughout the pig genome and was shown in Fig. 1A.
After genomic mapping and SNP calling, a total of 10,784,484 SNPs were discovered using all individuals (Table S1). A series of quality control filtering of SNPs was performed to identify 227,921 SNPs used in the further analysis based on the selection criteria (integrity > 0.8; MAF > 0.05). The density distribution of SNPs was calculated throughout the pig genome and was shown in Fig. 1B. Almost all of the genome’s non-overlapped 1 Mb region contained SNPs, which indicated that the data was reliable.
As population stratification might affect GWAS, quantile-quantile (Q-Q) plots of all traits were drawn. The observed P-value calculated by the association study fit the expected ones, which suggested that the population stratification was well-corrected and the association analysis using MLM was reliable. The Q-Q plot of each trait was shown following the Manhattan plot of the corresponding trait (Fig. 2, Fig. 3). A total of 21 SNPs were identified as significant (P ≤ 1.0E-05) for the traits investigated (Table 2). Only one SNP (SSC7:29503670 for ABFT) exceeded the 1% genome-wide significance level (P = 3.26E-08). Another SNP near this SNP (SSC7:29486003 for ABFT) exceeded the 10% genome-wide significance level (P = 2.21E-07). Among the detected SNPs, four, three, two, seven, five, six and four SNPs, were significantly associated with SBFT, LRBFT, LBFT, ABFT, 67RBFT, CLP and CFP traits, respectively. For BFT traits, SNPs were distributed in SSC3 (SSC for Sus scrofa chromosome), SSC7, SSC8, SSC9, SSC12, SSC13 and SSC15. For CLP and CFP traits, SNPs were distributed in SSC9, SSC10, SSC11 and SSC14. In this study, 33 genes located within 100 kb upstream and downstream of these significant SNPs were considered potential candidate genes (Table 2).
Trait1 | SNP2 | SSC3 | Pos (bp)4 | MAF5 | P-value6 | -log10P7 | Allele | PEV (%)8 | Genes9 | Distance10 |
---|---|---|---|---|---|---|---|---|---|---|
SBFT | SSC7:21392136 | 7 | 21392136 | 0.11 | 2.31E-06 | 5.64 | T/C | 11.182 | POM121L2 | U:50,008 |
ZNF391 | Intron | |||||||||
ZNF184 | D:21,214 | |||||||||
SSC7:29486003 | 7 | 29486003 | 0.12 | 5.29E-06 | 5.28 | T/C | 10.845 | COL21A1 | Intron | |
SSC7:29503670 | 7 | 29503670 | 0.14 | 3.94E-06 | 5.4 | T/C | 10.549 | COL21A1 | Intron | |
SSC15:117022248 | 15 | 117022248 | 0.2 | 3.42E-06 | 5.47 | T/C | 8.982 | BARD1 | D:25,591 | |
ENSSSCG00000048197 | U:30,544 | |||||||||
LRBFT | SSC7:21466553 | 7 | 21466553 | 0.12 | 1.11E-06 | 5.96 | C/A | 15.435 | ZNF391 | U:66,090 |
ZNF184 | U:31,328 | |||||||||
SSC7:29503670 | 7 | 29503670 | 0.14 | 8.98E-06 | 5.05 | T/C | 12.749 | COL21A1 | Intron | |
SSC3:34308396 | 3 | 34308396 | 0.06 | 2.80E-06 | 5.55 | G/A | 8.402 | NA | NA | |
LBFT | SSC7:29503670 | 7 | 29503670 | 0.14 | 1.02E-06 | 5.99 | T/C | 14.006 | COL21A1 | Intron |
SSC8:36710706 | 8 | 36710706 | 0.08 | 7.50E-06 | 5.12 | G/T | 11.22 | NA | NA | |
ABFT | AEMK02000449.1:179574 | AEMK02000449.1 | 179574 | 0.12 | 1.73E-06 | 5.76 | C/G | 6.138 | OR4P4L | U:56,372 |
OR4C13L | U:69,382 | |||||||||
OR5D14 | D:96,469 | |||||||||
OR5D13L | D:83,741 | |||||||||
SSC7:21392136 | 7 | 21392136 | 0.11 | 1.01E-05 | 5.00 | T/C | 12.82 | POM121L2 | U:50,008 | |
ZNF391 | Intron | |||||||||
ZNF184 | D:21,214 | |||||||||
SSC7:21466553 | 7 | 21466553 | 0.12 | 1.05E-06 | 5.98 | C/A | 14.59 | ZNF391 | U:66,090 | |
ZNF184 | U:31,328 | |||||||||
SSC7:29486003 | 7 | 29486003 | 0.12 | 2.12E- 07* | 6.67* | T/C | 15.416 | COL21A1 | Intron | |
SSC7:29503670 | 7 | 29503670 | 0.14 | 3.26E- 08** | 7.49** | T/C | 16.855 | COL21A1 | Intron | |
SSC7:30292654 | 7 | 30292654 | 0.14 | 6.58E-06 | 5.18 | G/A | 12.635 | GRM4 | U:13,533 | |
SMIM29 | D:36,896 | |||||||||
NUDT3 | D:27,653 | |||||||||
HMGA1 | D:27,800 | |||||||||
SSC13:197751407 | 13 | 197751407 | 0.08 | 2.46E-06 | 5.61 | G/A | 8.815 | MRPS6 | Intron | |
SLC5A3 | U:32,711 | |||||||||
67RBFT | AEMK02000449.1:179574 | AEMK02000449.1 | 179574 | 0.12 | 4.84E-06 | 5.32 | C/G | 7.11 | OR4P4L | U:56,372 |
OR4C13L | U:69,382 | |||||||||
OR5D14 | D:96,469 | |||||||||
OR5D13L | D:83,741 | |||||||||
SSC9:8763434 | 9 | 8763434 | 0.38 | 8.00E-06 | 5.1 | C/G | 10.151 | LIPT2 | D:74,668 | |
KCNE3 | D:47,288 | |||||||||
PGM2L1 | UPS | |||||||||
P4HA3 | U:95,660 | |||||||||
ENSSSCG00000042110 | U:60,242 | |||||||||
SSC9:8763503 | 9 | 8763503 | 0.38 | 9.62E-06 | 5.02 | A/G | 10.01 | LIPT2 | D:74,599 | |
KCNE3 | D:47,219 | |||||||||
PGM2L1 | UPS | |||||||||
P4HA3 | U:95,729 | |||||||||
ENSSSCG00000042110 | U:60,311 | |||||||||
SSC3:34308396 | 3 | 34308396 | 0.06 | 2.81E-06 | 5.55 | G/A | 5.416 | NA | NA | |
SSC12:50896936 | 12 | 50896936 | 0.11 | 4.80E-06 | 5.32 | G/A | 9.537 | PITPNM3 | 3’UTR | |
PIMREG | DOWNS | |||||||||
AIPL1 | D:6,502 | |||||||||
CLP | AEMK02000598.1:813208 | AEMK02000598.1 | 813208 | 0.38 | 2.95E-06 | 5.53 | T/C | 14.179 | OR1J4L | DOWNS |
OR1J4L | D:88,150 | |||||||||
SSC9:8744721 | 9 | 8744721 | 0.08 | 9.18E-06 | 5.04 | C/T | 13.264 | LIPT2 | D:93,381 | |
KCNE3 | D:66,001 | |||||||||
PGM2L1 | Intron | |||||||||
P4HA3 | U:76,947 | |||||||||
ENSSSCG00000042110 | U:41,529 | |||||||||
SSC14:38533013 | 14 | 38533013 | 0.25 | 8.13E-06 | 5.09 | G/A | 4.588 | SDS | U:19,133 | |
LHX5 | U:75,836 | |||||||||
SDSL | U:48,858 | |||||||||
PLBD2 | Intron | |||||||||
DTX1 | D:27,847 | |||||||||
RASAL1 | D:65,256 | |||||||||
SSC10:46524739 | 10 | 46524739 | 0.45 | 5.52E-06 | 5.26 | G/A | 8.98 | FAM171A1 | Intron | |
SSC11:47266057 | 11 | 47266057 | 0.5 | 2.01E-06 | 5.7 | C/G | 11.239 | NA | NA | |
SSC11:47266119 | 11 | 47266119 | 0.42 | 1.98E-06 | 5.7 | G/A | 10.719 | NA | NA | |
CFP | SSC10:46524739 | 10 | 46524739 | 0.45 | 7.61E-06 | 5.12 | G/A | 10.873 | FAM171A1 | Intron |
SSC11:47266057 | 11 | 47266057 | 0.5 | 6.72E-06 | 5.17 | C/G | 9.643 | NA | NA | |
SSC11:47266119 | 11 | 47266119 | 0.42 | 2.51E-06 | 5.6 | G/A | 10.736 | NA | NA | |
SSC10:50888840 | 10 | 50888840 | 0.4 | 4.26E-06 | 5.37 | T/G | 9.072 | KIAA1217 | Intron | |
1Description of the traits is in Table 1 SBFT Backfat thickness at the shoulder LRBFT Backfat thickness at the last rib LBFT Backfat thickness at the last lumbar ABFT Average backfat thickness 67RBFT Backfat thickness at 6–7 ribs CLP Carcass lean percentage CFP Carcass fat percentage 2The chromosome of SNP is combined with location information 3Sus scrofa chromosome 4Positions of the significant SNP according to the Sus Scrofa Build 11.1 assembly 5Minor Allele Frequency 6Genome-wide significant associations are underlined 6,7*and** represented the 10% and 1% genome-wide significance, respectively. 8Phenotypic Variation Explain 9The gene located with 100Kb upstream and downstream of the significant SNP 10UPS Upstream (5’ of the gene) DOWNS Downstream (3’ of the gene) U/D represented the nearest gene located upstream or downstream of the SNP (Intergenic region). |
Interestingly, for BFT traits, six SNPs were associated with at least two BFTs at the significance level (P ≤ 1.0E-05) among the significant SNPs. Among these, SSC7:21392136 and SSC7:29486003 were found to be significantly associated with SBFT (P = 2.31E-06, P = 5.29E-06) and ABFT (P = 1.01E-05, P = 2.21E-07). SSC7:21392136 could explain 11.182% and 12.82% of the phenotypic variation (PV) for SBFT and ABFT, respectively. SSC7:29486003 could explain 10.845% and 15.416% of the PV for SBFT and ABFT, respectively. SSC7:21466553 was significantly associated with LRBFT (P = 1.11E-06) and ABFT (P = 1.05E-06). SSC7:21466553 could explain 15.435% and 14.59% of the PV for LRBFT and ABFT, respectively. SSC7:29503670 was significantly related to four BFT traits, including SBFT (P = 3.94E-06), LRBFT (P = 8.98E-06), LBFT (P = 1.02E-06) and ABFT (P = 3.26E-08). The locus could explain 10.549%, 12.749%, 14.006% and 16.855% of the PV for SBFT, LRBFT, LBFT and ABFT, respectively. These results were shown in Table 2. Above four candidate SNPs, SSC7:21392136 and SSC7:21466553 have located apart 74.417 kb each other, while SSC7:29486003 and SSC7:29503670 have located apart 17.667 kb each other, which meant that the responsible gene may locate near here. The results showed that the adjacent and within genes of the SSC7:21392136 and SSC7:21466553 were ZNF184 and ZNF391, while SSC7:29486003 and SSC7:29503670 were located in the intron region of COL21A1. Besides, the fifth shared SNP AEMK02000449.1:179574 (unmapped to the porcine reference chromosome) was significantly associated with ABFT (P = 1.73E-06) and 67RBFT (P = 4.84E-06). The locus could explain 6.138% and 7.11% of the PV for ABFT and 67RBFT, respectively. Furthermore, four genes, including OR4P4L, OR4C13L, OR5D14, and OR5D13L were found within the ± 100 kb region of the locus. The sixth shared SNP of SSC3:34308396 was significantly associated with LRBFT (P = 2.80E-06) and 67RBFT (P = 2.81E-06). However, there were no genes found nearby the locus (Table 2).
In the remaining SNPs, SSC15:117022248 and SSC8:36710706 were unique significant SNP loci for SBFT (P = 3.42E-06) and LBFT (P = 7.50E-06), respectively. BARD1 and ENSSSCG00000048197, located near SSC15:117022248 were found. There were no genes found within the ± 100 kb region of SSC8:36710706. Besides, SSC7:30292654 and SSC13:197751407 were unique for ABFT (P = 6.58E-06, P = 2.46E-06). Four genes, including GRM4, SMIM29, NUDT3 and HMGA1, were located within 100 kb upstream and downstream of SSC7:30292654. The adjacent and within genes of the SSC13:197751407 were MRPS6 and SLC5A3. Three SNPs, including SSC9:8763434, SSC9:8763503 and SSC12:50896936 were exclusive for 67RBFT (P = 8.00E-06, P = 9.62E-06, P = 4.80E-06). Five genes, including LIPT2, KCNE3, PGM2L1, P4HA3 and ENSSSCG00000042110 were found nearby SSC9:8763434 and SSC9:8763503. Interestingly, the five genes were also found near SSC9:8744721, which was significantly associated with CLP (P = 9.18E-06). Among the five genes, PGM2L1 was the closest gene to the three loci on SSC9. Finally, the adjacent and within genes of the SSC12:50896936 were PITPNM3, PIMREG and AIPL1. These results were shown in Table 2.
For the traits of CLP and CFP, three shared SNP loci, including SSC10:46524739, SSC11:47266057 and SSC11:47266119 were significantly associated with CLP (P = 5.52E-06, P = 2.01E-06, P = 1.98E-06) and CFP (P = 7.61E-06, P = 6.72E-06, P = 2.51E-06). SSC10:46524739 could explain 8.98%, and 10.873% of the PV for CLP and CFP, respectively. The locus was located in the intron region of FAM171A1. SSC11:47266057 could explain 11.239% and 9.643%, while SSC11:47266119 could explain 10.719% and 10.736% of the PV for CLP and CFP, respectively. Nevertheless, there were no genes found nearby the two loci. Furthermore, AEMK02000598.1:813208 (unmapped to the porcine reference chromosome) and SSC14:38533013 were unique for CLP, while SSC10:50888840 was exclusive for CFP. OR1J4L was located near the locus of AEMK02000598.1:813208. Six genes, including SDS, LHX5, SDSL, PLBD2, DTX1 and RASAL1 were found nearby the SNP OF SSC14:38533013. Among these six genes, the intron region of PLBD2 contained the locus. Finally, the SNP of SSC10:50888840 was located in the intron region of KIAA1217. These results were shown in Table 2.
Go analysis showed that ZNF184 and ZNF391 were involved in regulation of transcription, DNA-templated and nucleic acid binding, etc. COL21A1 was involved in collagen trimer and extracellular matrix. Three ORs, including OR4P4L, OR4C13L, and OR1J4L were mainly enriched in G-protein coupled receptor signaling pathway and G-protein coupled receptor activity, etc. PLBD2 was enriched in lipid catabolic process and hydrolase activity. GO analysis showed that PGM2L1 was enriched in carbohydrate metabolic process, phosphorylation and glucose-1,6-bisphosphate synthase activity, etc. FAM171A1 significantly enriched BP term was regulation of cell shape. These results were shown in Table S3.
In the present study, we performed a GWAS based on SLAF-seq to screen and select candidate SNPs for BFT, CLP and CFP traits on the four-way crossbred population with Landrace, Yorkshire, Duroc and Saba pigs as the hybrid parents. SLAF-seq as a high-resolution method for genome-wide genotyping is considered as an enhanced reduced representation sequencing, which has several significant advantages: accurate genotyping by depth paired-end sequencing, reduced sequencing costs, pre-design strate for fragmentation efficiency, and double barcode system for large populations [24]. In the study, a total of 1,552,377 SLAFs from the pig reference genome (Sscrofa 11.1_102) were predicted using RsaI and HaeIII enzyme combination. With an average sequencing depth of 11.94-fold, a total of 10,784,484 SNPs were obtained from 223 crossbreds using SLAF-seq and a Q30 value reached 90.74%. After filtering, 227, 921 SNPs distributed among 38 pig chromosomes genome were used to perform GWAS for BFT, CLP and CFP traits, which were nearly three times the number of SNPs on Illumina PorcineSNP80 Genotyping BeadChip. Furthermore, compared with Genotyping BeadChip, SLAF-seq can provide new variation information and is more suitable for the identification of new variation sites. Currently, SLAF-seq was successfully applied to pig genotyping and identified a large number of new mutation sites [31, 32]. Considering the huge differences in genomic variation between Chinese and Western pig breeds, our results showed that SLAF-seq was a powerful method and had great potential for further study in more pig breeds. Therefore, SLAF-seq method could be considered a more competitive choice in pig genome research.
Currently, 70, 273, 58, 586 and 15 QTLs associated with SBFT, LRBFT, LBFT, ABFT and 67RBFT, respectively, deposited in the pig QTL database (http://www.animalgenome.org/ cgi-bin/QTLdb/index, Dec 27, 2021). In the study, a total of 14 SNPs were identified as significant for the BFT traits investigated. Among the detected SNPs, four, three, two, seven and five SNPs, were significantly associated with SBFT, LRBFT, LBFT, ABFT and 67RBFT, respectively. For BFT traits, SNPs identified in the study were mainly distributed in SSC3, SSC7, SSC8, SSC9, SSC12, SSC13 and SSC15, which showed that the genetic structure affecting BFT traits was complex. Among significant SNPs, one SNP (SSC7:30292654) associated with ABFT was located in the QTL region that has been reported in the White Duroc × Erhualian F2 population [7]. Besides, four shared SNPs related to at least two BFT traits, including SSC7:21392136, SSC7:21466553, SSC7:29486003 and SSC7:29503670 were also identified on SSC7. The four loci were located in the QTL region that has been reported in the five domesticated modern populations (Large White, Duroc-Large White synthetic, Yorkshire/Large White, and Landrace) [33]. In the study, a total of five SNPs significantly associated with BFTs were found on SSC7 (between 21.39Mb and 30.29Mb). A lot of previous studies have shown that QTLs related to BFT traits were located in the region near the five loci on SSC7 [8, 34–36].
Furthermore, the locus of SSC15:117022248 related to SBFT was located between microsatellites SW120 and SW906, which has a high density of QTLs associated with fat content [37]. The SNP at SSC3:34308396 associated with LRBFT and 67RBFT was located in the QTL region reported in the Duroc-Pietrain population [38], and SSC8:36710706 related to LBFT was located in the QTL region and has been reported in White Duroc × Erhualian intercross population [5]. SSC13:197751407 associated with ABFT was located in the QTL region (195.0-206.7Mb), and has been reported in the F2 Duroc × Pietrain population [39]. Two SNPs at SSC9:8763434 and SSC9:8763503 related to 67RBFT were located in the QTL area, which has been reported in Landrace [40] and Meishan × Large White populations [41]. The SNP at 50896936 bp on SSC12 associated with 67RBFT was located in the QTL region (23.7-60.0Mb) reported in the Duroc-Pietrain population [42].
For CLP and CFP traits, 265 and 58 QTLs have been accumulated respectively in the pig QTL database (http://www.animalgenome.org/cgi-bin/QTLdb/index, Dec 27, 2021). In the study, a total of seven SNPs were identified as significant for the CLP and CFP traits. Among the detected SNPs, six and four SNPs, were significantly related to CLP and CFP traits, respectively. For CLP and CFP traits, SNPs were mainly distributed in SSC9, SSC10, SSC11 and SSC14. The SNP at SSC10:46524739 associated with CLP and CFP traits located in the QTL region (46.08-46.98Mb) of feed conversion ratio that has been reported in Duroc pigs [43]. Besides, two SNPs at SSC11:47266057 and SSC11:47266119 associated with CLP and CFP traits located in the QTL region and SSC9:8744721 associated with CLP traits located in the QTL region were reported in F2 Duroc × Pietrain populations [44]. SSC14:38533013 related to CLP was located in the QTL region of average backfat thickness that has been reported in Chinese Meishan × Dutch Large White populations [45] and populations of backcrossing F1 Meishan-White composite females to either Meishan or White composite boars [35]. Finally, SSC10:50888840 associated with CFP was located in the QTL region of shoulder subcutaneous fat thickness and has been reported in Meishan × Large White populations [41].
Of the 21 significant SNPs associated with BFT, CLP and CFP traits found in four-way crossbred pigs in our study, except for two significant SNPs (unmapped to the porcine reference chromosome) that have not been reported in previous studies, other SNPs were located in previously reported QTLs. All significant SNPs can be used as new candidate loci for future studies of porcine BFT, CLP and CFP traits. However, more validations need to be conducted in more pig breeds.
Our results indicated that some loci might have pleiotropic effects on different traits. Interestingly, A 74,417 Kb interval (from 21.39 to 21.46 Mb) on SSC7 was shown to contain two SNPs that were associated with three BFT traits, including SSC7:21392136 for SBFT (P = 2.31E-06) and ABFT (P = 1.01E-05), and SSC7:21466553 for LRBFT (P = 1.11E-06) and ABFT (P = 1.05E-06). There was also an interval with a length of 0.8Mb (from 29.48 to 30.29 Mb) on SSC7, which was shown to contain three SNPs that were associated with four related traits, including SSC7:29503670 for SBFT (P = 3.94E-06), LRBFT (P = 8.98E-06), LBFT (P = 1.02E-06) and ABFT (P = 3.26E-08), SSC7:29486003 for SBFT (P = 5.29E-06) and ABFT (P = 2.12E-07), and SSC7:30292654 for ABFT (P = 6.58E-06). Coincidently, Qiao et al. also reported several significant SNPs for BFT traits on SSC7 in the 2.1Mb region between 33.30 Mb and 35.42 Mb [7]. On SSC9, two SNPs including SSC9:8763434 and SSC9:8763503 were associated with 67RBFT (P = 8.00E-06, P = 9.62E-06), and were close to SNP SSC9:8744721 which was associated with CLP (P = 9.18E-06). The result of the correlation analysis showed that the correlation coefficient between 67RBFT and CLP was negative and highly significant (r = − 0.47, P < 0.001) (Fig. S2). Three adjacent SNPs associated with both traits were within a region of 0.19Mb (between 87.44Mb and 87.63Mb) on SSC9, which indicated that this region might contain a pleiotropic QTL (Table 2).
Furthermore, the correlation coefficient between CLP and CFP was negative and highly significant (r = − 0.86, P < 0.001) (Fig. S2). Interestingly, two contiguous SNPs on SSC11, including SSC11:47,266,057 and SSC11:47,266,119 were significantly associated with both traits. Besides, a 4.36 Mb region (between 46.52 and 50.88 Mb) on SSC10 was associated with CLP and CFP, which indicated that the region might contain a pleiotropic QTL (Table 2).
The GWAS result showed that six SNPs were associated with at least two BFTs at the significance level (p < 1.0E-05). The adjacent and within genes of the SSC7:21392136 and SSC7:21466553 were ZNF184 and ZNF391. GO analysis showed that ZNF184 and ZNF391 participated in regulation of transcription, DNA-templated and nucleic acid binding, etc. As is known, zinc-finger proteins (ZFPs) represent the largest transcription factor family in mammals [46]. These two genes belong to ZFP transcription factors, which play a crucial role in regulating diverse growth and development processes through nucleic acid binding and transcription activation, etc. [47]. An increasing number of ZFPs involved in adipogenesis have been discovered, such as zinc finger protein 423 (Zfp423) [48, 49], Zfp467[50, 51], Zfp521[52], ZNF395[53]. A large number of ZFPs play roles in preadipocyte differentiation, such as ZNF638[54], GATA2[55], GATA3[55] and SLUG [56], which belong to the C2C2-type zinc finger protein subfamily consisting of a highly conserved zinc finger DNA binding domain. It is known that ZNF184 and ZNF391 also are the C2H2-type zinc finger transcriptional factors. According to their structure and function, it is inferred that ZNF184 and ZNF391 might be involved in regulating adipogenesis as C2H2-type transcriptional factors. In addition, SSC7:29486003 and SSC7:29503670 were located in the intron region of COL21A1. The results of the functional analysis showed that COL21A1 was enriched in collagen trimer and extracellular matrix (ECM). It has been reported that COL21A1 is VAdomain-containing collagen with a domain structure and which is a part and the smallest of the FACIT family of collagen expressed in various tissues including heart, stomach, placenta, skeletal muscle, kidney and liver [57, 58]. The co-expression of collagen XXI and collagen I in tissues and muscles has an important role in the organization of interstitial collagen fibrils by connecting them to other matrix components or cells [57, 59]. A study finds that adipocyte differentiation is regulated by the deposition of collagen which is the major component of ECM [60]. So, COL21A1 plays role in ECM architecture and function. The ECM remodeling is associated with the modulation of adipogenesis during adipose tissue expansion [61]. Besides, Tanaka et al. study the structure of pig genome region containing fat-related QTLs by constructing bacterial artificial chromosome (BAC) overlap from DST to SRPK1 region on SSC7. The analysis of nucleotide sequences of three BAC clones including rearrangement points shows that COL21A1 and DST are located near the rearrangement point, and they do not exist in the corresponding human region [62]. Our findings would help to further understand the role of COL21A1 in backfat metabolism and this gene should be considered a strong candidate gene for BFT traits.
Furthermore, four genes, including OR4P4L, OR4C13L, OR5D14, and OR5D13L were located within 100 kb upstream and downstream of AEMK02000449.1:179574, which was significantly associated with ABFT and 67RBFT. Interestingly, OR1J4L was found nearby the AEMK02000598.1:813208, which was significantly related to CLP. In the study, GO analysis showed that OR4P4L, OR4C13L and OR1J4L were mainly enriched in G-protein coupled receptor signaling pathway and G-protein coupled receptor activity, etc. These five genes belong to the olfactory receptor (OR) gene family. The ORs are G protein-coupled receptors that recognize a large number of organic compounds in accordance with their homologous ligand, which have been found to be expressed in different organs or tissues in mammals [63, 64]. The ORs expressed in extra-nasal tissues are involved in a variety of biological processes, including the regulation of glucose and lipid metabolism [65]. Therefore, it could be inferred that these five OR genes might be involved in regulating the energy metabolism and lipid metabolism of pigs to affect BFT and CLP traits of pigs. However, there were no genes found within ± 100kb surround of SSC3:34308396, which was significant associations with LRBFT and 67RBFT. As above, these seven significant SNPs and eight genes might be important for the fatness of the pigs and be potential targets of BFT traits for further molecular breeding research.
In the remaining significant SNPs, BARD1 and ENSSSCG00000048197, located near SSC15:117022248 were found. Enrichment analysis showed that the BARD1 gene was involved in RNA binding and ubiquitin-protein transferase activity. At present, these two genes were not been reported in fat-related studies. Four genes, including GRM4, SMIM29, NUDT3 and HMGA1, were located within 100 kb upstream and downstream of SSC7:30292654. Of these four genes, HMGA1 can be a candidate gene for the locus, as HMGA1 is functionally related to fat metabolism and its variants have been associated with backfat thickness [7]. Some studies show that HMGA1 can serve as a regulator of IGF1 activity to regulate glucose uptake [66] and can bind with PPARG, a key modulator of adipocyte differentiation and glucose homeostasis [67]. Besides, the adjacent and within genes of the SSC13:197751407 were MRPS6 and SLC5A3. MRPS6 encodes a subunit of mitochondrial ribosomal protein 28s23 [68], while SLC5A3 is a gene embedded in MRPS6, encoding a protein that transports sodium and inositol under hyperosmotic stress [69]. Currently, these two genes were not reported in fat metabolism for further study. Five genes, including LIPT2, KCNE3, PGM2L1, P4HA3 and ENSSSCG00000042110 were nearby the two adjacent SNPs (SSC9:8763434 and SSC9:8763503), which were significantly associated with 67RBFT. Interestingly, the five genes were also found near SSC9:8744721, which was significantly related to CLP. Among the five genes, PGM2L1 was the closest gene to the three loci. GO analysis showed that PGM2L1 was enriched in carbohydrate metabolic process, phosphorylation and glucose-1,6-bisphosphate synthase activity, etc. It is known that PGM2L1 belonging to a distinct family of α-phosphohexomutases widely distributed in prokaryotes can make glucose-1,6-bisphosphate, a key metabolic regulator [70]. PGM2L1 plays an indispensable role in glycogen metabolism, glycolysis and gluconeogenesis as a key enzyme [70]. Our findings provided a new understanding of the role of PGM2L1 in the backfat metabolism of pigs and should be considered a strong candidate gene for BFT and CLP traits. Finally, the adjacent and within genes of the SSC12:50896936 were PITPNM3, PIMREG and AIPL1. PITPNM3 (phosphatidylinositol transfer protein 3) and AIPL1 (Aryl hydrocarbon receptor-interacting protein-like 1) are previously reported in association with vision (i.e., phototransduction [71, 72]). The PIMREG gene may be related to porcine immunity [73]. Alvarenga et al. find that the PITPNM3 gene located on BTA19 and SSC12 is associated with temperament in cattle and struggling bouts (frequency and duration) and latency at an age of 12 days in pigs, which is a key gene related to animal behavioral characteristics [74]. However, the mechanism of these three genes located near SSC12:50896936 on porcine BFT traits needs to be further studied.
For the traits of CLP and CFP, three loci, including SSC10:46524739, SSC11:47266057 and SSC11:47266119 were significantly associated with CLP and CFP traits. SSC10:46524739 could explain 8.98%, and 10.873% of the PV for CLP and CFP, respectively. The SNP locus was located in the intron region of FAM171A1. GO analysis showed that the FAM171A1 gene was significantly enriched in regulation of cell shape (Table S3). The FAM171A1 gene is a member of the family of sequence similarities, which is responsible for encoding the APCN/FAM171A1 protein. APCN/FAM171A1 is an evolutionarily conserved 98 kDa transmembrane type I glycoprotein expressed in various cells, which is involved in the regulation of the cytoskeletal dynamics and thereby the cell shape [75]. According to its function and extensive expression in various cells, perhaps FAM171A1 could be considered a candidate gene for the porcine CLP and CFP traits and further studied its effects on the growth, proliferation and differentiation of muscle cells and fat cells. However, no genes were found within 100kb upstream and downstream of the SSC11:47266057 and SSC11:47266119, but these two loci could explain about 10% of the PV for CLP and CFP, respectively. It was inferred that these two SNPs might be important for the CLP and CFP of the pigs and might be potential targets for further molecular breeding research.
Six genes, including SDS, LHX5, SDSL, PLBD2, DTX1, and RASAL1 were found nearby SSC14:38533013 associated with CLP. Among these six genes, the intron region of PLBD2 contained the SNP loci. GO analysis showed that PLBD2 was enriched in lipid catabolic process and hydrolase activity. Wang et al. find that PLBD2 and other nine genes, predominantly participate in the lipid catabolic process, which is in agreement with the favoured selection for fatness in Chinese pigs through analysis of the selected regions for Tongcheng pigs using the selective sweep method [76]. The PLBD2 gene should be considered a strong candidate gene for porcine CLP traits. Finally, the SNP of SSC10:50888840 related to CFP traits was located in the intron region of KIAA1217. A study finds that the KIAA1217 gene shows differences in promoter methylation and mRNA expression in the omental visceral adipose tissue between non-obese and obese individuals [77]. Perhaps, the KIAA1217 gene might be used as a candidate gene for the porcine CFP trait to further study its effects on lipid metabolism.
However, these identified loci and genes need to be further verified in more pig populations, and their functions also need to be validated by more biological experiments in pigs.
In this study, we employed SLAF-seq to perform GWAS for seven fatness-related traits in 223 four-way crossbred pigs. Using the MLM model, we identified a total of 21 SNP loci displaying a significant association (P ≤ 1.00E-05), including nine SNP loci having co-identification on two or more traits. Next, candidate genes were screened in the 100 kb zone of each of the 21 SNP loci to identify 33 candidate genes possibly related to the seven fatness-related traits. Our subsequent analyses determined that ZNF184, ZNF391, COL21A1, OR4P4L, OR4C13L, OR5D14, OR5D13L, OR1J4L, HMGA1, PGM2L1, FAM171A1, PLBD2 and KIAA121 were candidate genes for carcass backfat thickness, and/or carcass lean percentage, and/or carcass fat percentage for further study. Moreover, these SNP loci and candidate genes might serve as a biological basis for improving these important traits of pigs.
A total of 223 four-way crossbred pigs (108 males and 115 females, DSYLS) used in this study were produced with 7 hybrid boars (Duroc × Saba, DS) and 37 hybrid sows (Yorkshire × (Landrace × Saba), YLS) as the parents. All the animals were reared under the same nutritional and environmental conditions until slaughtered (105.25 ± 15.75 Kg) at the pigs and broilers breeding farm in Chuxiong City, Yunnan Province, China. Ear tissue samples were collected from 223 crossbreds.
In the study, the studied phenotypes were BFT, CLP and CFP traits. BFT traits included backfat thickness at the shoulder (SBFT), backfat thickness at the last rib (LRBFT), backfat thickness at the last lumbar (LBFT), average backfat thickness (ABFT), and backfat thickness at 6–7 ribs (67RBFT). The SBFT was the backfat thickness at the thickest point over the shoulder. The LRBFT was the backfat thickness at the last rib. The LBFT was the backfat thickness at the last lumbar. The 67RBFT was the backfat thickness between the 6th and 7th ribs. SBFT, LRBFT, LBFT and 67RBFT were measured using the vernier caliper. The ABFT was calculated as the average of three measurements: SBFT, LRBFT, and LBFT. The left side of each carcass was dissected by separating bone, muscle, fat, and skin. Each component was individually weighed, and recorded as bone weight (BW), muscle weight (MW), fat weight (FW) and skin weight (SW). CLP and CFP were calculated as follows:
CLP (%) = MW/(BW + MW + FW + SW).
CFP (%) = FW/(BW + MW + FW + SW).
The MEANS procedure of SAS (SAS Institute, Inc., Cary, NC) was used to generate descriptive statistics for studied traits. The sample distribution was visualized as a frequency distribution histogram using the R package “ggpubr”. The phenotypic correlation was visualized as a correlation heatmap by the R function “PerformanceAnalytics”.
Genomic DNA was extracted from ear tissue samples. Total DNA was extracted by the phenol-chloroform extraction method, with concentration and purity measured using the NanodropTM 2000 spectrophotometer (Thermo Scientific, Waltham, MA, USA) and electrophoresis. A simulated restriction enzyme digestion was carried out on the current pig genome (Sscrofa 11.1, ftp://ftp.ensembl.org/pub/release-102/) to identify expected SLAF yield, avoid repetitive SLAFs, and obtain the relatively uniform distribution of restriction fragments in the genome. As a result, genomic DNA was digested with RsaI and HaeIII restriction enzyme combinations. Meanwhile, to assess the experimental procedure, Oryza sativa indica (http://rapdb.dna.affrc.go.jp/) was used as a control for evaluating the effectiveness of enzyme digestion and paired-end mapped reads. In brief, SLAF library construction and sequencing for each individual was conducted as described previously [24] with slight modifications: DNA fragments of 314–344 base pair (bp) were selected as SLAFs and used for paired-end sequencing by Illumina HiSeq 2500 system (Illumina, Inc., San Diego, CA, USA) at Beijing Biomarker Technologies Corporation. Raw sequencing reads were identified by dual-indexing [78] and classified to each sample.
Raw paired-end reads were mapped to the pig reference genome (Sscrofa 11.1_102) using BWA software [79]. In general, SLAF groups produced by reads were mapped to the same position. If an accession was only partially digested by the restriction enzymes, reads mapped to the reference genome should have overlapped with two SLAF tags and were assigned to both of the SLAF tags in the same accession. SNP calling was performed by both GATK and Samtools analysis [80, 81], and a locus was defined as a SNP if it was simultaneously called from these two packages. High-quality SNPs were obtained for further analysis by filtering according to minor allele frequency (MAF: 0.05) and integrity (int: 0.8) by PLINK 2 [82].
Based on the developed high-density molecular marker data, GEMMA software [83] was used for association analysis between traits and SNPs. The mixed linear model (MLM) of GEMMA software was as follows:
y = Wα + xβ + Zu + ε
u ~ MVNm (0, λτ−1K)
e ~ MVNn (0, τ−1 In)
Where n is the number of individuals, m is the number of groups, strains or clusters, y is an n × 1 vector of quantitative traits, W = (w1, w2 … wc ) is an n ×c matrix of covariates (fixed effects) including a column vector of 1, α is a c×1 vector of corresponding coefficients including the intercept, x is an n×1 vector of marker genotypes, β is the effect size of the marker, Z is an n×m loading matrix, u is an m×1 vector of random effects, ε is an n×1 vector of errors, τ−1 is the variance of the residual errors, λ is the ratio between the two variance components, K is a known m×m relatedness matrix, In is an n×n identity matrix and MVN denotes multivariate normal distribution.
In the study, W (fixed effect) was the matrix of population structure calculated by the ADMIXTURE program [84], and Z was the matrix of kinship relationship calculated using GCTA software [85]. X was the vector of the genotype, y was the vector of the traits, µ (random effect) was the vector of the kinship relationship between samples, and ε was the vector of random residual effects. Finally, an association result could be obtained for each variation site. Since the Bonferroni correction (BC) method [83] for multiple testing was too conservative and only a few significant SNPs were detected, markers with adjusted − log10 (P) ≥ 5 (control threshold) were considered as significant for SNP-trait association. Based on the number of SNPs analyzed (n = 227,921), The threshold p-value for genome-wide 1% and 10% significance were 4.39E-08 (0.01/227,921) and 4.39E-07 (0.1/227,921), respectively. Considering the complexity of the traits, markers that passed the threshold score or above the threshold − log10(P) were held to be significantly associated with the target trait.
Potential candidate genes were identified within the ± 100kb region of significant loci using the Ensembl Sscrofa11.1 database (www.ensembl.org) and NCBI database (https://www.ncbi. nlm.nih.gov/). The GO enrichment analysis of candidate genes was then performed using Gene Ontology Consortium (http://www.geneontology.org).
Genome-wide association study
Specific-locus amplified fragment sequencing
Carcass backfat thickness
Backfat thickness at the shoulder
Backfat thickness at the last rib
Backfat thickness at the last lumbar
Average backfat thickness
Backfat thickness at 6–7 ribs
Carcass lean percentage
Carcass fat percentage
Sus scrofa chromosome
Minor Allele Frequency
Quantile–quantile plot
mixed linear model
Phenotypic variation
Phenotypic variation explain
Single nucleotide polymorphism
Quantitative trait loci
Gene ontology
Olfactory receptor
zinc-finger protein.
Acknowledgments
Not applicable.
Authors’ contributions
SXL and YCP conceived and designed the experiments. DWY, XYW, XXD and MLL determined the phenotypic data and collected the sample. HYW performed the experiment and processed and analyzed the data. HS and QC assisted with the processing of data. HYW and XYW wrote the manuscript that was subsequently revised by YCP and SXL. All authors have read and approved the final manuscript.
Funding
This study was supported by the National Natural Science Foundation of China (U1402266), Yunnan Swine Industry Technology System Program (2020KJTX0016) and Yunnan Province Important National Science & Technology Specific Projects (202102AE090039). These funding agencies played no role in the design of the study, data collection, analysis and interpretation, or in writing the manuscript.
Availability of data and materials
The Genome sequencing raw data was deposited in NCBI’s SRA database (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi? view=studies&f=study&term= &go=Go; Accession: SRP376933).
Declarations
Ethics approval and consent to participate
All animals used in this study were used according to the guidelines for the care and use of experimental animals established by the Ministry of Agriculture and Rural Affairs of China. The ethics committee of Yunnan Agricultural University (YNAU, Kunming, China) approved the entire study. We confirmed that the study was conducted in accordance with ARRIVE guidelines 2.0 [86].
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Author details
1Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming 650201, Yunnan, China. 2Faculty of Animal Science, Xichang University, Xichang 615000, Sichuang, China. 3Faculty of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China. 4Faculty of Animal Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China.