QTLR mapping by SDP
Selective DNA pooling was used to map MD QTLRs on the chicken autosomes, using DNA pools of sires selected from 8 elite Hy-Line lines with daughter tests for MD mortality. After consolidating QTLRs within 1 Mb of one another, both within and across lines, 37 QTLRs were found in 7 lines.
No QTLR was identified in Line RIR1. This result is somewhat intriguing, as RIR1 happened to be the line with the strongest selection to the tails, i.e the smallest proportion in the tails and consequently the highest selection intensity (ip/2 = 1.5 in Table S1). However, in spite of the strongest selection, the difference between the high and low pools and the phenotypic variation were not exceptional among the lines (Table S1). Thus, the best explanation for the lack of QTLR in Line RIR1 would be sampling variation and a case of false negative.
The identified QTLRs distributed over 15 autosomes. Finding QTLRs on only some of the chromosomes aligns well with previous studies, including our own in the FSAIL F6 population [14]. It also accords well with the expectation that a limited number of genomic elements have a detectable effect on any trait, and those causative elements are not evenly distributed over the genome. Finally, as always, QTLRs may be missed due to sampling variation (a false negative error).
No QTLR was found on GGA 16, the location of the chicken MHC. The Chicken QTLdb reports a single MD QTL on this chromosome, which was our own report on the FSAIL F6 population [14]. There too we accounted for the MHC genotype. Keeping in mind that sampling variation could cause a false positive error in the F6 study or a false negative here in the SDP, it nevertheless seems that the MHC correction worked well.
Seven QTLRs were tested by individual genotyping. Taking one significant test as a validation, individual genotyping confirmed the association of 5 of the 7 QTLRs tested (Table 4). Note that validation of these QTLRs is based on all sires from the lines from which samples of sires were selected to construct the selected DNA pools. Thus, this is not a completely independent replication test. Hence, the 71% confirmation cannot be taken as high. On the other hand, the number of QTLRs tested by individual genotyping is too small for a firm conclusion.
The number of QTLRs was nearly the same as the 38 found by the other arm of the same project, namely QTL mapping in an FSAIL F6 population [14], suggesting the same power of the two methods. Nevertheless, the size of the QTLRs in the two studies was very different − 0.49 Mb by the pools (0.06–1.37), compared to 1.37 Mb by F6 on GRCg6a (0.24–4.29). Thus, the average size of QTLRs found by the pools was 36% of that identified by the F6, suggesting a much higher resolution of the pure lines and SDP design compared to the F6 of the full-sib advanced intercross.
It is not trivial to compare the two studies. Too many aspects are different, with too many confounding factors. One F6 population on one side vs 8 lines on the other. The F6 was a cross of only two lines with a very different susceptibility to MD (one line related to Line WL4 used here). It was designed for fine QTL mapping with high genetic variance in the population and map expansion over generations [18]. The 8 lines were all inbred. All F6 birds were genotyped individually, while the 8 lines were tested by SDP. Thus, all we can say is that apparently the higher genetic variance and map expansion of F6 on one side, and the number of populations and selective DNA pooling on the other, compensated each other.
Only two within-line QTLRs overlapped with a QTLR in another line, while 35 QTLRs were line-specific (Table S3). Finding a QTLR in only one-line accords well with Smith et al. (2020) [14], where 10 of 19 autosomes with MD QTLRs, showed QTLRs in only one family. As reported previously, many genomic elements affect the response to MD, most with relatively small effect [12]. Thus, identification of QTLRs in a single line may confirm those previous observations that indeed most QTLR effects are too small for identification with the current experimental design.
The limited overlaps of QTLRs found here with previous studies is not a surprise, given the very different designs, including different populations (e.g., the same F6 in Heifetz et al. (2007, 2009) [17, 18] and Smith et al. (2020) [14] vs 8 elite lines here), different genotyping methods (individual genotyping by Yonash et al. (1999) [16], 600K SNP array genotyping by Smith et al. (2020) [14], SDP by Heifetz et al. (2007) [17] and in the present study), different marker types (microsatellites by Heifetz et al. (2007, 2009) [17, 18] and Yonash et al. (1999) [16], SNPs by Smith et al. (2020) [14] and here), and different phenotypes (e.g., age of death or survival by Heifetz et al. (2007) [17] and Smith et al. (2020) [14], various MD traits and MD indexes by Yonash et al. (1999) [16], and daughter mortality in this study).
QTLR allele substitution effect obtained by the pools
The top 3 marker allele substitution effects were used to estimate QTLR allele effects and their contributions to the phenotypic and genotypic variation (cP and cG). These were summed within each line to estimate the variations explained by the QTLRs in that line. QTLR allele effects averaged 2.2% MD-mortality in the progeny (Table 2). Thus, the absolute effect of these individual QTLRs is appreciable - the difference between alternative homozygotes averaged 4.4% daughter mortality. Autosomal QTLRs explained an average of 6.6% and 50.9% of the phenotypic and genotypic variations within lines, which was up to 98.4% of the genetic variation (Table 3). These very high proportions could be a result of sampling variation. Nevertheless, they are impressive, indicating the potential benefit of integrating these QTLRs in breeding programmes for MD resistance.
Vallejo et al. (1997) [19] and Yonash et al. (1999) [16], both using the same resource F2 population and microsatellites, mapped QTLs of MD traits and MD indexes. Based on their results, individual QTL explained an average of 3.8% 4.2% of the trait phenotypic variation, almost 4 times the average of 1.1% found here (Table 2). The summed phenotypic variation explained by the QTLs averaged 9.0% and 17.9%, which is 1.5 to almost 3 times the sum found here (Table 3). However, the relatively large estimates of those two studies are likely to be upwardly biased, due to their very limited power based on very sparse microsatellites compared to the high-density SNP array used here. As has been shown previously, when power is low, allele substitution effects are biased upwards [20].
Allele effect of QTLR markers estimated by individual genotyping
Individual genotypes of QTLR markers were used to obtain allele substitution effects. Significant markers were used to obtain contributions to the phenotypic (cP) and genotypic (cG) variations within lines. According well with the estimated allele effects based on pool genotypes, appreciable contributions were found, with means of 0.046 and 0.356 for cP and cG respectively (Table S11). These values are comparable to the cP and cG means of 0.066 and 0.509 estimated by the pools (Table 3). The pools and the individuals were each tested by very different platforms – selected DNA pools genotyped by SNP array vs individual genotyping of all sires. The calculation methods were also very different – allele intensities of SNP microarray genotypes in the pools vs individual genotypes. Finally, only a few of the QTLRs were tested by individual genotyping of just a few elements each. Thus, the similarity of the allele effects is impressive, mutually validating these results.
Linkage disequilibrium among QTLR markers
LD was calculated for all possible pairs of markers on the same chromosome, and LD blocks were identified. Moderate to high LD blocks were found in all lines. Somewhat different LD patterns were found among the 8 lines on the same chromosome. The sources of these differences could partly be due to different marker informativity in the various lines (and hence different markers listed in the analyses), or result from true differences in population structure, as a consequence of different histories of each line.
A mix of very high and practically no LD values was often found between adjacent markers, even over very short distances. Skipping non-significant markers, again we found interdigitated and fragmented LD blocks (e.g., Table 5), as we reported previously [15, 21]. Such a mix could result from assembly errors [22]. However, assuming few assembly errors, they would be expected to be expressed mostly as high LD among the majority of adjacent markers, and islands of singleton markers with low LD with the surrounding markers. The groups rather than single markers consistently showed repeating LD values, suggesting that the complex blocks are not an artefact, but a true biological phenomenon. Furthermore, the repeating of such complex blocks in 8 independent populations here, and in previous reports [15, 21, 23, 24, 25], strengthens the claim that this is a true phenomenon. It could have genuine biological meaning through processes such as gene conversion, population bottlenecks, non-random mating and more.
In accord with our previous report, the blocks fitted well with the distribution of the P, |α|, cP and cG values, justifying the use of them to infer location and identify causative elements. For example, in Table 5 the mutation causing the significant test of Marker MD04_04C, is not likely near the immediate upstream 3.3 Kb non-significant Marker MD04_04B with no LD between them (r2 = 0.116), but more likely 10.7 Kb upstream, near Marker MD04_04A with similar p-value and complete LD between them (r2 = 1.000).
Testing QTLRs by individual genotyping
Seven of the QTLRs were tested by individual genotyping of sires at markers residing in QTLR genomic elements, candidate to affect MD response by their location (QTLR) and reported function. Association tests, in silico investigation and LD analysis were used to assess the potential candidate elements and to narrow possible location of causative mutation. Detailed discussion of QTLRs 17 and 28 is presented in the following sections. The remainder of the tested QTLRs are presented in the supplementary materials.
QTLR 17 on GGA4
QTLR 17 was found by the pools in Line WL5 (Table 1), but was significant by individual genotyping only in Line WL2 and Across Lines (Table S10). Line WL5, however, was tested individually by only 4 of the 9 markers tested in this QTLR, with one marker approaching significance. Thus, the lack of significance by individual genotyping in this line could be a matter of sampling variation or marker informativity.
In the significant WL2 Line, the above mentioned fragmented interdigitated high and moderate LD blocks were found (Table 5). The 4 significant markers created a moderate to high LD block, of which the 3 most significant were in a high LD block, surrounding a high LD block of 2 non-significant markers.
Though the 4 significant markers in Line WL2 comprised an LD block, they included 2 different genes: 2 markers from the CCDC111 gene (PRIPOL; Primase and DNA Directed Polymerase), and 2 from CASP3 (Caspase 3) (Tables S8 and S9). In humans, CCDC111 is involved in DNA damage response and adaptive response to cisplatin, a chemotherapeutic that causes reversal of replication forks in cancer cells [26]. CASP3 was associated in human with monocytic leukaemia and colonic adenocarcinoma [27].
Given the very high LD among these 3 markers, the similar P-values, and their function, both genes are candidates as a causative element. However, among the highly significant markers, only MD04_03A is an amino acid substitution, between the similar isoleucine and leucine at position 199 in the CCDC111 gene (Table S9). The similarity between the 2 alternative amino acids makes this polymorphism unlikely to change the protein function. Marker MD04_04A is intronic in the CASP3 gene, and MD04_04C is downstream of CASP3. Again, these locations make both unlikely to be the causative mutation.
The actual causative element(s) could be part of CCDC111 or the CASP3 genes, for example a regulatory mutation in the promoter, or could be independent element(s) in high linkage with them. More study is needed to confirm.
QTLR 28 on GGA9
QTLR 28 was found by the pools in Line WL2 (Table 1), and was also significant by individual genotyping, along with Lines WPR1, WL5, RIR1, and also Across Lines (Table S10). QTLR 28 was tested by 20 markers (Table 4), distributed unevenly over the QTLR. Inspecting marker locations and distances between them (Table S10, Fig. 1), split this QTLR into 3 regions: 2 markers in the upstream region at 12.6–12.7 Mb (Markers MD09_01B, MD09_01A), 11 markers in the central region at 14.0–14.1 Mb (Markers MD09_02A – MD09_06C), and 7 markers in the downstream region at 16.1–16.5 Mb (Markers MD09_07C - MD09_08D).
Figure 1goes here.QTLR 28 on GGA 9 in Line WL2.
All 8 upstream and central markers tested in Line WL2 were significant, but none of the remaining 4 downstream markers were (Table 6, Fig. 1). Location, P-values and LD blocks accorded well with one another (Table 6). The 2 significant markers of the upstream DLG1 gene formed one simple high LD block. Six significant markers of the central GMNC gene, micro-RNA gga-mir-1762 and the gene TMEM207, formed another simple high LD block. Finally, of the 4 downstream non-significant markers, the 2 most non-significant in the gene EPHB formed yet a third, but fragmented, high LD block. These results suggest two causative elements distributed in Line WL2; one in or close to the upstream gene DLG1, and another in the neighbourhood of the central three significant elements.
A stretch of 9 significant and 1 approaching-significance markers in Line WPR1 distributed in the central region of QTLR 28, from GMNC to TMEM207 (Table S10). The first 2 upstream and the last 2 downstream markers tested in this line were not significant. Similar, albeit somewhat more complex distributions of locations, P-values and LD blocks (Table S12), suggest that only a central causative element of Line WL2 is distributed in this line. In Line WL5 too, a central stretch of significant markers (Table S10) formed a simple high to moderate LD block (Table S12), again suggesting a single central causative element, likely shared with Lines WL2 and WPR1.
Line RIR1 differs from the previous 3 significant lines. The 4 significant markers included 1 upstream, 1 central, and 2 downstream markers. The upper significant Marker MD09_01B had practically no LD with any other marker in this QTLR, the central significant Marker MD09_06A had very low to moderate LD with the central block of markers and the 2 downstream significant Markers MD09_08B and MD09_08C were in complete LD with one another. Thus, Line RIR1 may be displaying the upper and central causative elements found in the 3 previous significant lines, and also an additional downstream causative element.
The Across Lines significant tests distributed over the entire QTLR, mixed with non-significant markers, overall enhancing the presence of 3 causative elements in this QTLR, namely an upstream element distributed in Lines WL2 and RIR1, a central element distributed in Lines WL2, WPR1, WL5 and RIR1, and a downstream element distributed only in Line RIR1.
The upstream region includes the gene DLG1 (Discs Large MAGUK Scaffold Protein 1). This gene has, among others, a role in signal transduction, cell proliferation, synaptogenesis and lymphocyte activation [28]. Both markers tested in this region were non-coding.
The central region includes the gene GMNC, an intergenic marker, micro-RNA gga-mir-1762, and the TMEM207 gene. GMNC (Geminin Coiled-Coil Domain Containing) has been predicted to affect chromatin binding activity, DNA replication, cell cycle, and cilium assembly [29]. Both markers tested in this region were non-coding. gga-mir-1762 (Gallus gallus (chicken) microRNA 1762) is where the most significant results were obtained in Lines WL2, WPR1 and WL5. Unfortunately, there is currently no known function for this miRNA. TMEM207 (Transmembrane Protein 207) has been associated with cancer in human [30]. This association makes it a prime causative candidate in this region, and indeed in Line RIR1 Marker MD09_06A in this gene was the only significant marker in the central region (Table S10). Two of the 3 markers tested in this gene are silent mutations (MD09_06A, MD09_06B), while MD09_06C lies in the 3’ region of this gene (Table S9).
LAMP3 (Lysosomal Associated Membrane Protein 3) was the only gene significant in the downstream region of QTLR 28, and only in Line RIR1 and Across lines tests. This gene is involved in the unfolded protein response (UPR) that contributes to protein degradation and cell survival during proteasomal dysfunction, autophagic process, dendritic cell function and adaptive immunity against microbial and viral infections [31].
Two markers were significant in Line RIR1 (Table S10): Marker MD09_08B involved substitution between the nonpolar and hydrophobic amino acids glycine and proline, and Marker MD09_08B which is a silent mutation. Thus, both markers are unlikely to represent a causative mutation, although variation in a currently undefined regulatory element is possible.