Background: Orthologs diverge after speciation events, whereas paralogs diverge after gene duplication. Orthologs are therefore expected to retain their functions, while paralogs could be a source of new functions. Because protein functional divergence follows from non-synonymous substitutions, we used the ratio of non-synonymous to synonymous substitutions (dN/dS) as a proxy for functional divergence. We used four working definitions of orthology, including reciprocal best hits (RBH) and definitions based on network analyses and clustering.
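As an illustration of the RBH definition mentioned above, the following is a minimal sketch, not the programs distributed with this manuscript. It assumes that pairwise hits are already available as (query, subject, score) tuples, for example parsed from tabular sequence-search output. The names `best_hits` and `reciprocal_best_hits` are hypothetical, and no coverage or E-value filtering is applied.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (query id, subject id, alignment score).
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject.

    Ties are resolved in favour of the hit seen first (an assumption of
    this sketch, not necessarily what the authors' programs do).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (gene in A, gene in B) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Toy data: a2's best hit is b1, but b1's best hit is a1, so a2 is unpaired.
    a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0),
              ("a2", "b1", 180.0), ("a2", "b3", 90.0)]
    b_vs_a = [("b1", "a1", 210.0), ("b2", "a1", 140.0), ("b3", "a2", 95.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy example only the pair (a1, b1) qualifies. Under this working definition, the remaining hits are candidates for paralog comparisons rather than ortholog pairs.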
Results: Orthologs, by all definitions tested, had noticeably lower dN/dS values than paralogs, suggesting not only that orthologs keep their functions better, but also that paralogs are a ready source of functional novelty. The differences in dN/dS ratios continued to favour the functional stability of orthologs after we eliminated gene comparisons with potential problems, such as genes with high codon usage bias, low coverage of either of the aligned sequences, or sequences with very high similarity. The dN/dS ratios also continued to suggest better functional stability of orthologs regardless of overall sequence divergence.
Availability: A couple of programs for obtaining orthologs and dN/dS values, as tested in this manuscript, are available on GitHub: https://github.com/Computational-conSequences/SequenceTools.