Characterization of feed efficiency groups
Two groups of eight animals with extreme values of feed efficiency were selected and named High Feed Efficient (HFE, lowest RFI) and Low Feed Efficient (LFE, highest RFI). These groups were significantly different for feed efficiency traits RFI, feed conversion ratio (FCR), residual weight gain (RWG), residual intake and weight gain (RIG), for dry matter intake (DMI) and average daily gain (ADG) (Table 1). The LFE animals presented higher backfat thickness at the end of the experiment (P<0.05), supporting that HFE animals are more feed efficient because they eat less, have similar ADG and are leaner than LFE animals (5,19). A total of 11,361 genes were expressed in liver samples from HFE and LFE animals and eight genes were differentially expressed (DE) (P ≤ 0.1): NR0B2, SOD3, RHOB, Bta-mir-2904-2, FTL, CYP2E1, GADD45G and FASN (5).
Characterization of expressed SNPs and INDELs associated with feed efficiency
The variant calling analysis detected 268,393 and 37,587 SNPs and INDELs, respectively, from the 16 samples. These variants were tested for Minor Allele Frequency (MAF)<0.4 and call rate=50%, which left 68,852 SNPs and 5,148 INDELs, respectively, for statistical analysis. From these, 2,149 SNPs and 139 INDELs were significantly different between HFE and LFE animals (P<0.05). Part of these variants were found in exonic regions (51% for SNPs and 29% for INDELs) but a considerable fraction was found outside the gene (43% for SNPs and 66% for INDELs) or in introns (6% for SNPs and 5% for INDELs, Fig. 1).
We considered the variants with moderate and high impact as potential functional variants totalizing 256 variants (247 SNPs and 9 INDELs) selected for further analysis. They generated effects on proteins as follows: shorter proteins due to frameshift INDELs, bigger proteins due to insertion of one or two AA by inframe insertion INDELs, changes in single amino acid (AA) on proteins due to missense SNPs and a stop retained INDEL (able to change at least one base of the terminator codon but maintaining the function of the terminator). These 256 potential functional variants were distributed across 190 genes, as some variants were found in spliced isoforms and there were genes with more than one variant (Supplementary table 1).
Depicting the biology of potential functional variants
In order to understand the importance of these 256 potential functional variants we performed the functional enrichment analysis for the selected genes, which detected three different significant pathways (False Discovery Rate - FDR<0.10): "Regulation of Complement cascade", "Complement cascade" and "Innate Immune System" with fold enrichments of 14.79x, 12.19x and 2.57x respectively (Table 2).
The three pathways associated with FE by our approach were all related to the innate immune system, in particular with the regulation of the complement cascade. The first two pathways enriched from the same set of six genes (Table 3) and the Innate Immune System enriched from twenty genes (Table 3), including the six genes from the complement cascade/regulation of complement cascade pathways. The list of these 20 genes, their potential functional variants, the impact on protein sequence and the frequency in each group can be found in the Supplementary table 2. Considering the “regulation of complement cascade” and “complement cascade” as one pathway since they were enriched for the same set of genes, it calls the attention the overrepresentation of Complement H factor genes (complement factor H-related 5, complement factor H and complement factor H precursor). There were four different significant variants in these genes and all were homozygous in HFE animals instead of some degree of heterozygosity in LFE animals (Table 4).
Effects of other important potential causal variants
At last, we analyzed the effects of high impact INDELs on protein function since they altered the protein sequence considerably but were not enriched in the processes already described. The frameshift INDEL on GAS2L1 caused a deletion of 7 AA in the calponin homology domain and had an allele frequency of 35% (5/14) in the HFE animals whereas it was not found in LFE group (0/8, P<0.05). Another frameshift INDEL generated a deletion of 7 AAs in the ATP binding cassette (ABC) domain of the TAP2 and had an allele frequency of 50% (7/14) in the HFE group whereas only 25% in the LFE group (3/12, p<0.05). Interestingly, this INDEL was found significantly associated not only in one but also in two spliced isoforms of TAP2. An INDEL in the transcription factor RREB1 caused an insertion of 2 AAs in one of the C2H2 Zinc finger domains of Bos indicus RREB1 and was associated with LFE since the allele frequency in this group was 30% (3/10) but in the HFE no allele was found (0/14, P<0.05). An INDEL was found in SEMA4F, which generated a premature stop codon reducing the COOH-terminal of the SEMA4F in 5 AAs. This region contains the PDZ domain of the protein responsible for anchoring the SEMA4F in the membrane to cytoskeletal components. The allele with the deletion of 40 bp was found in 50% in LFE group (5/10) and 16% in HFE group (2/12, P<0.05). An inframe insertion INDEL in the ECSIT insert a glutamic acid in the position 423 at the COOH terminal of ECSIT and this variant is associated with feed efficiency as the allele frequency was 29% (4/14) in the HFE and 0% (0/12) in the LFE group (P<0.05).
Another inframe insertion INDEL was found in the ENSBTAG00000011926 gene, which is probably the transcription factor ZFN665-like that inserted a leucine in one C2H2 Zn finger region of the protein. The variant with the inserted leucine is associated with lower feed efficiency with an allele frequency of 38% (3/8) instead of 6% (1/15) in the HFE group (P>0.05). Another potential functional variant is the insertion of one Alanine (252) just one position from the phosphorylation serine site of RPP30 (S251) with a higher frequency in the LFE group (LFE=50%, 5/10; HFE=14%, 2/14; p<0.05). The INDEL in the MGAT2 although considered a frameshift INDEL, it did not change the function since it generated a stop retained codon.