Background: Long-term balancing selection (LTBS) can maintain allelic variation at a locus over millions of years and through speciation events. Variants shared between species, hereafter “trans-species polymorphisms” (TSPs), often result from LTBS due to host-pathogen interactions. For instance, the major histocompatibility complex (MHC) locus contains TSPs present across primates. Several hundred candidate TSPs have been identified in humans and chimpanzees; however, because many are in non-coding regions of the genome, the functions and adaptive roles for most TSPs remain unknown.
Results: We integrated diverse genomic annotations, with a focus on non-coding regions, to explore the functions of 125 previously identified regions containing multiple TSPs in humans and chimpanzees. We analyzed genome-wide functional assays, expression quantitative trait loci (eQTL), genome-wide association studies (GWAS), and phenome-wide association studies (PheWAS). We identify functional annotations for 119 TSP regions, including 71 with evidence of gene regulatory function from GTEx or genome-wide functional genomics data and 21 with evidence of trait association from GWAS and PheWAS. TSPs in humans associate with many immune system phenotypes, including response to pathogens, but we also find associations with a range of other phenotypes, including body mass, alcohol intake, urate levels, chronotype, and risk-taking behavior.
Conclusions: The diversity of traits associated with non-coding human TSPs further supports previous hypotheses that functions beyond the immune system are subject to LTBS. Furthermore, several of these trait associations provide support and candidate genetic loci for previous hypotheses about behavioral diversity in great ape populations, such as the importance of variation in sleep cycles and risk sensitivity.