In this study, we identified several SLE-associated loci that are under positive selection in a population-specific manner. The timing of positive selection at some loci was estimated to fall during or after the Neolithic Transition (< 13 kya); this indicates that both environmental factors and recent culturally driven changes in human lifestyle possibly played roles in driving selection at these loci. The variants at the SLE risk region 7q11.23 appear to be under positive selection in all analysed human populations. Nonetheless, the different haplotype structures and the occurrence of gene copy number variation at this genomic region make it difficult to identify the true causal SLE variants and to pinpoint the primary target of positive selection. The FST analysis revealed high genetic differentiation (FST > 0.25) for most of the positively selected loci. Thus, positive selection is an important factor influencing the frequency of SLE risk alleles in different human populations. This is in line with two studies reporting positive selection as an important mechanism that can cause elevated population-specific frequencies of autoimmune risk loci [18, 19]. Similar to a previous study, we found that the number of positively selected SLE GWAS loci is higher in Europeans than in Africans [23]. Additionally, we found that no signatures of positive selection on SLE risk SNPs were shared between Africans and Europeans (exception: SLE risk region 7q11.23). Most SLE GWAS, however, analysed mainly individuals with European genetic ancestry, which could bias the set of loci available for study when examining positive selection at SLE-associated loci.
Almost all SNPs that are under positive selection function as eQTLs in different tissue types (Table 2). This is in line with a recent study which showed, for African and European individuals, that cis-eQTLs were significantly enriched for high iHS values, indicating the importance of regulatory genetic variation in recent human evolution [21, 22]. We also detected that several TF binding motifs are located in adaptive sequences. Thus, it is not fully clear whether the potential SLE risk genes are per se under positive selection, or whether this pertains to eQTLs and TF binding motifs that regulate the expression of other genes. The adaptive eQTLs influence gene expression mainly in tissues such as the thyroid, brain, skin, whole blood, adipose, oesophagus and muscle, suggesting that these tissues were the primary targets of recent genetic adaptation. Moreover, a few positively selected SNPs that function as eQTLs also influence genes that are involved in immunity, such as defensin beta 134 (DEFB134), APAF1 interacting protein (APIP) and neutrophil cytosolic factor 1 (NCF1). The latter gene, which is located very close to the GTF2I locus, has been reported to be associated with SLE in Japanese populations [24]. Furthermore, while some adaptive eQTLs affect the expression of multiple genes, the majority affect significantly fewer genes and tissues than non-adaptive eQTLs do. This finding is in line with [25], who reported that adaptive eQTLs tend to affect fewer tissues than non-adaptive eQTLs. These results suggest that genetic adaptation reduces the effect of pleiotropy.
Purifying selection is expected to purge deleterious genetic polymorphisms from a population’s gene pool. Importantly, our findings show that risk variants are retained in the human genome. This somewhat contradictory result is attributable to the fact that phenotypic adaptations (i.e., epigenetics and eQTLs) can occur faster than genotypic adaptations. These changes give a population the potential to adapt more rapidly to environmental stimuli by modifying gene expression without having to alter the genetic code. Moreover, for a genetic polymorphism to be subjected to negative selection, it must exert a drastic harmful effect in terms of reducing individual fitness. SLE mainly affects women of reproductive age, and this disease therefore clearly reduces the fitness of affected individuals. From an evolutionary perspective, the question arises as to why the underlying alleles have not been driven to low frequencies if SLE reduces the chances of survival and reproduction. Pathogen-mediated positive selection is thought to drive genetic adaptation at loci involved in immune responses. However, recent lifestyle changes during and after the Neolithic Transition may also have acted as selective pressures (e.g., as shown for dairy farming and the associated genetic adaptation of the lactase gene).
Our study found a very recent timing of positive selection (2.2 kya – 3.2 kya) for the beneficial allele rs10774625-A in Europeans at the SLE risk locus SH2B3-ATXN2. The protein encoded by SH2B3 influences a variety of signalling pathways mediated by Janus kinase (JAK) and receptor tyrosine kinases (RTKs), acts as a negative regulator in cytokine signalling, and is involved in growth factor signalling activities [26]. The gene ATXN2 lies very close to SH2B3. It encodes a polyglutamine protein involved in RNA metabolism and metabolic homeostasis [27]. The SH2B3-ATXN2 region was found to be associated with various other diseases such as rheumatoid arthritis [28], type 1 diabetes [29] and coronary artery disease [30]. Our study shows that the SH2B3-ATXN2 region is exclusively under strong positive selection in Europeans, as reported in two previous studies [31, 32]. SH2B3 has been suggested to play a role in protecting against bacterial infection because a risk allele for celiac disease was found to be associated with activation of the NOD2 recognition pathway. This could potentially explain the selective sweep at this locus [33]. Importantly, we recorded signatures of positive selection across the genes SH2B3-ATXN2 (Additional file 4), which are also characterised by strong LD (Additional file 5); their close proximity on the chromosome makes it difficult to determine which of these two genes is actually under positive selection. Either one of the genes or potentially both are under positive selection; another possibility is that the identified eQTLs within this region are under positive selection because they control the expression of the downstream ALDH2 gene (Additional file 5 shows that SH2B3-ATXN2 and ALDH2 are located in the same extended LD block in Europeans). This suggests that the ALDH2 gene is the primary target of positive selection in Europeans.
The protein encoded by this gene, which functions in the mitochondrial matrix, belongs to the aldehyde dehydrogenase family of proteins and represents the second enzyme of the major oxidative pathway of alcohol metabolism; it plays a key role in oxidising acetaldehyde into nontoxic acetate. The alcohol flush reaction is predominantly shown by people of Asian ancestry carrying an ALDH2*2 allele, i.e., carrying the inactive isozyme with limited activity to convert acetaldehyde into acetate. In contrast, people of Caucasian ancestry usually have only the active isozyme (ALDH2*1 allele). The positively selected allele rs10774625-A, as well as other positively selected SNPs in this genomic region in Europeans, function as eQTLs and are associated with increased ALDH2 expression. We therefore conclude that the culturally driven European lifestyle involving high alcohol consumption is the driving force for positive selection at this locus. The result is an increasing frequency of the SLE risk allele in Europeans. The frequency (in the 1000 Genomes populations) of the beneficial allele rs10774625-A in Europeans is about 0.48, whereas it is virtually absent in East-Asians. In Africans the allele frequency at this locus is 0.02; in South-Asians it is about 0.07 (Additional file 1). Finally, Neanderthal and Denisovan sequences show the ancestral allele (G) at this SNP.
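To illustrate how such allele-frequency differences translate into population differentiation, the following is a minimal sketch that computes a single-SNP FST from the rounded frequencies quoted above. It assumes Wright's heterozygosity-based definition, FST = (HT − HS) / HT, with two equally weighted populations; this is not necessarily the estimator used in our FST analysis, and the function name `wright_fst` is purely illustrative.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Single-SNP FST for two equally weighted populations.

    Assumption: Wright's definition FST = (HT - HS) / HT, where HT is the
    expected heterozygosity of the pooled population and HS is the mean
    expected heterozygosity within populations. No sample-size correction.
    """
    p_bar = (p1 + p2) / 2                              # pooled allele frequency
    h_t = 2 * p_bar * (1 - p_bar)                      # total heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_t == 0:                                       # allele fixed or absent in both
        return 0.0
    return (h_t - h_s) / h_t


# Rounded rs10774625-A frequencies as quoted in the text (1000 Genomes).
freqs = {"EUR": 0.48, "AFR": 0.02, "SAS": 0.07}

fst_eur_afr = wright_fst(freqs["EUR"], freqs["AFR"])
fst_eur_sas = wright_fst(freqs["EUR"], freqs["SAS"])

print(f"EUR vs AFR: {fst_eur_afr:.3f}")
print(f"EUR vs SAS: {fst_eur_sas:.3f}")
```

Under these assumptions, the Europe versus Africa comparison gives a value of roughly 0.28, above the 0.25 threshold. Because the input frequencies are rounded and the estimator is simplified, the values are indicative only.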
How might this beneficial allele be linked to SLE susceptibility in Europeans? ALDH2 is not only a major detoxification enzyme for ethanol-derived acetaldehyde but also mediates the activation of nitric-oxide synthases (NOS), a family of enzymes catalysing the production of nitric oxide (NO) [34]. NO is involved in various vital physiological processes including host defence. However, NO can also form peroxynitrite (ONOO−), a reactive NO species (RNOS) exerting various proinflammatory actions [34, 35]. Acute or chronic exposure to ethanol is apparently associated with increased RNOS through induction of NOS expression, protein nitration and lipid oxidation. Studies indicate that increased inducible NOS activity is associated with the progression of SLE [36]. Other studies, however, indicate that moderate alcohol consumption is associated with decreased SLE risk [37, 38]. We propose that the beneficial allele associated with increased ALDH2 expression has positive effects on ethanol detoxification (after increased alcohol consumption) and NOS activation. Nonetheless, in SLE progression, increased inducible NOS activity (potentially due to oxidative stress) limits ALDH2 activity, possibly leading to (sex-specific [39]) increased formation of RNOS and reactive aldehydes (such as 4-hydroxynonenal and malondialdehyde). This alters the immune response in SLE. We thus suggest that this risk SNP acts as a ‘promoter’ rather than a ‘trigger’ of SLE in Europeans. Furthermore, this SNP is located within enhancer histone marks, and the funMotifs database shows that it is also located within the binding sequence for the transcription factor forkhead box protein O1 (foxo1). This TF is involved in regulating glucose metabolism and insulin signalling, and it regulates metabolic homeostasis in response to oxidative stress. Foxo1 also interacts with NAD-dependent deacetylase sirtuin1 (SIRT1), which has been shown to inhibit T cell activation [40].
Thus, the regulatory elements at this locus, including the effects of positive selection, could contribute to substantial phenotypic/trait differences among individuals with different genetic ancestries.