Background Convergent and parallel evolution provide unique insights into the mechanisms of natural selection. Some of the most striking convergent and parallel (collectively recurrent ) amino acid substitutions in proteins are adaptive, but there are also many that are selectively neutral. Accordingly, genome-wide assessment has shown that recurrent sequence evolution in orthologs is chiefly explained by nearly neutral evolution. For paralogs, more frequent functional change is expected because additional copies are generally not retained if they do not acquire their own niche. Yet, it is unknown to what extent recurrent sequence differentiation is discernible after independent gene duplications in different eukaryotic taxa.
Results We develop a framework that detects patterns of recurrent sequence evolution in duplicated genes. This is used to analyze the genomes of 90 diverse eukaryotes. We find a remarkable number of families with a potentially predictable functional differentiation following gene duplication. In some protein families, more than ten independent duplications show a similar sequence-level differentiation between paralogs. Based on further analysis, the sequence divergence is found to be generally asymmetric. Moreover, about 6\% of the recurrent sequence evolution between paralog pairs can be attributed to recurrent differentiation of subcellular localization. Finally, we reveal the specific recurrent patterns for the gene families Hint1/Hint2, Sco1/Sco2 and vma11/vma3.
Conclusions The presented methodology provides a means to study the biochemical underpinning of functional differentiation between paralogs. For instance, two abundantly repeated substitutions are identified between independently derived Sco1 and Sco2 paralogs. Such identified substitutions allow direct experimental testing of the biological role of these residues for the repeated functional differentiation. We also uncover a diverse set of families with recurrent sequence evolution and reveal trends in the functional and evolutionary trajectories of this hitherto understudied phenomenon.