Improving oil quality and increasing yield per hectare in oil palm is a major concern in the oil processing industry. Agrosavia’s palm breeding program has focused on developing interspecific hybrids of OxG which present heterosis in traits such as resistance to diseases, fruit number, fruit weight, leaf length, and trunk diameter [30]. To our knowledge, this study represents the first GWAS analysis of an OxG hybrid population.
Phenotypic data
Correlation analysis results among yield-related traits indicated that BN could be a potential and better selection criterion for production than BW in the hybrid population. It is known that the leaf emission rate determines the BN in a palm, which is negatively correlated with the number of leaves [31]. In our study, no significant correlations between yield and leaf-related traits (FA, LA, LDW, LXL, RL) were found, however, a previous study in E. oleifera and OxG hybrids found that BN can be higher than the number of leaves just when oil palms produced multiple inflorescences [32]. Increases in BN and BW are also expected to correlate with increased mesocarp and kernel oil yields, as shown in other oil palm germplasm studies [33]. Future studies directed to improve oil yields for the OxG hybrid population should be conducted considering their importance in oil palm breeding.
Association analysis
SNPs detected by GBS have allowed the genetic dissection of genotypic variation in complex traits in many plant species [34–36]. Specifically, in oil palm the GBS technique has been used for identifying candidate genes related to oil bunch [37], bunch weight [20, 37], stem height [21] and oil quality traits studies [38] with the enzymes ApeKI and PstI-MspI. Here, we used the enzyme PstI for the library construction, allowing us to identify relatively high number of SNPs with probably novel genetics variants, which according to Chung et al. [39] is one of the most important advantages of GBS.
In the present study, association mapping resulted in the identification of 12 regions related to 10 morphological and yield-related traits (Table 2). However, only five regions associated with LDW, TD, RL, and LXL remained significant (p ≤ 0.05) after the FDR correction. The small LD blocks in the heatmap analysis could suggest that the causal regions are nearby to the most significant SNPs; however, a more comprehensive study must be conducted to determine the actual SNPs and or genes controlling the trait of interest.
Therefore, we describe the five most significant regions and the genes located within those regions which might represent potential candidate genes involved in the expression of the phenotypic traits evaluated in this study. For morphological traits, a significant association was found for LDW on chromosome 3, explaining 10% of the phenotypic variation. The most significant SNP in this region was located in a mechanosensitive (MS) ion channel protein 10-like (MSL10) gene. In plants, MS ion channels have been proposed to play a wide array of roles, from the perception of touch and gravity to the osmotic homeostasis of intracellular organelles [40]. Besides, mechanoperception genes are essential for normal cell and tissue growth and development as well as for the proper response to an array of biotic and abiotic stresses [41]. A second significant region associated with TD on chromosome 15 was identified. Within this region a gene involved in nucleic acid binding and has a C2H2-type Zinc finger domain is located. The C2H2-ZF gene family has been proposed to be involved in the formation of wood, shoot and cambium development in species such as Poplar, as well as playing a role in stress and phytohormone response [42].
For RL and LXL traits, QTLs have been reported on chromosomes 4, 2, 10, and 16 [33]. In our study, three SNPs were associated with three different candidate genes for RL on chromosome 13. The SNP S13_20856724 is the closest to the AGC3 gene and encodes different G proteins. G proteins had been reported to be involved in a wide range of developmental and physiological processes, having a high potential for yield improvement in crops such as rice [43]. The other significant association was found with the SNP S13_23674227, which is located in an extracellular ribonuclease gene (RNase gene). RNase genes have been studied for years in plants, playing an important role in plant defense mechanism [44] or plant development due to their ability to modify RNA levels, and thereby influence protein synthesis [45]. The SNP S13_25522088 was also significantly associated with RL and LXL, but further studies are necessary to determine their role, if any, in regulating these traits.
Seven SNPs were not significant after FDR correction, possibly due to reduced sample size. The QTL and association studies are limited by the relatively small mapping population sizes resulting in low statistical power, thus rendering small or even medium effect QTL statistically non-significant and difficult to detect. Such statistically underpowered populations may also suffer from severe inflation of effect size estimates (so-called Beavis effect) [46]. Increasing the population size and marker density is required to enable unbiased beavis effect estimation and to achieve higher statistical power [46–48], yet for perennial populations (long generation time) and limited offspring numbers this would require a considerable investment.
Although these seven regions were non-significant after FDR correction, three of them were located near or alongside genes whose function could be related with the expression of the traits under study. The first of these regions for HT harbors the SNPs S15_22553489 and S15_22553493 located at the STYK gene, which is involved in the control of stomatal movement in response to CO2 [49]. Recent studies also showed the role of STYK in stem diameter by increasing the number of xylary fibers in species such as Bambusa balcooa [50]. Moreover, Pootakham et al. [20] found a region that controls the phenotypic variance of HT near this region on chromosome 15.
The other two regions were observed on chromosomes 5 and 10 for yield-related traits BN and BW. The gene p5.00_sc00003_p0367 potentially associated with BN, codes for a cation/H(+) antiporter. Antiporter proteins function as regulators of monovalent ions, pH homeostasis, and developmental processes in plants [51]. On chromosome 10, the gene p5.00_sc00004_p0097, potentially associated with BW, encodes a zinc finger protein (ZFP). The ZFP is a large protein family, involved in plant development, regulation of plant height, root development, flower development, seed germination, secondary wall thickening, anther development, and fruit ripening [52]. Studies conducted by Wu et al. [53] demonstrated that silencing a gene related to ZFP hampered fruit development in Nicotiana benthamiana [53]. This ZFP gene might play an important role on yield-related traits in oil palm, as shown in other plants, where overexpression of zinc finger proteins are related with higher yields [54].
In oil palm, harvesting of fruit bunches after a certain age is an arduous labor due to their tallness. For this reason, genotypes with less HT and TD are preferred among oil palm farmers. Likewise, larger foliar (RL and LDW) is related to a greater photosynthetic production which could be involved in higher productivity. But most importantly, increasing the number and weight of fruits means higher productivity per palm and therefore higher incomes for farmers. For this reason, leveraging QTLs or genes related to these traits, as the ones we identify in this study, could contribute to develop plant breeding strategies such as marker-assisted selection, that helps to select promising accessions in earlier stages (greenhouse conditions), and therefore, reduce the breeding cycle. Further work needs to focus on the biological functions of the set of potential candidate genes found in our research, yet the correlations identified in association studies cannot be dubbed as causations.