TANA: efficient approach for predicting protein functions by transferring annotation via alignment networks

doi:10.21203/rs.2.21071/v1

Methodology article

This work is licensed under a CC BY 4.0 License

Version 1

You are reading this older preprint version

Background: One of the challenges of the post-genomic era is to provide accurate functional annotations for orphan and unannotated protein sequences. With the recent availability of large protein-protein interaction (PPI) networks for many model species, there is a growing need for computational methods that elucidate protein function using a variety of strategies. Most existing computational approaches integrate diverse kinds of functional interactions and transfer annotations across species on the basis of similar sequences, 2D/3D structures, amino acid motifs, or phylogenetic profiles.

Results: In this work, we introduce a new approach, called TANA, for inferring protein functions. Its main originality lies in predicting the functions of an unannotated protein by transferring annotations via network alignment as well as from the protein's direct interaction neighborhood within its PPI network. In this way, we are able to discover functions of proteins that cannot easily be characterized by sequence homology. We assess the performance of our method using the standard metrics established by CAFA and observe a significant improvement over competing methods, in particular for predicting molecular functions.
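The idea of combining the two evidence sources can be illustrated with a minimal sketch. It is not the TANA implementation: the function name `predict_functions`, the equal weighting of both sources, the simple vote counting, and the toy protein identifiers are all assumptions made for illustration. Only the general principle (annotations arrive from aligned proteins in other networks and from direct interaction partners) comes from the abstract.

```python
from collections import Counter

def predict_functions(protein, alignment, annotations, neighbors, top_k=3):
    """Rank GO terms for `protein` by simple vote counting (illustrative only).

    alignment:   {protein: [proteins aligned to it in other species' networks]}
    annotations: {protein: set of GO term identifiers}
    neighbors:   {protein: [direct interaction partners in its own PPI network]}
    """
    scores = Counter()
    # Evidence 1: terms carried by proteins aligned to the query protein.
    for partner in alignment.get(protein, ()):
        scores.update(annotations.get(partner, ()))
    # Evidence 2: terms carried by direct interaction partners.
    # Assumption: both sources get equal weight; a real method may weight them.
    for partner in neighbors.get(protein, ()):
        scores.update(annotations.get(partner, ()))
    # Sort by descending vote count, then by term ID for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

# Toy data (hypothetical proteins; P1 is the unannotated query).
alignment = {"P1": ["Q1", "R1"]}
neighbors = {"P1": ["P2", "P3"]}
annotations = {
    "Q1": {"GO:0003677"},
    "R1": {"GO:0003677", "GO:0005524"},
    "P2": {"GO:0005524"},
    "P3": {"GO:0016301"},
}

predicted = predict_functions("P1", alignment, annotations, neighbors, top_k=2)
print(predicted)  # ['GO:0003677', 'GO:0005524']
```

In this toy case the two top terms each receive two votes, one of them purely from the alignment and the other from a mix of alignment and neighborhood, which is the kind of complementary evidence the abstract describes.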

Conclusions: This research is one of the first attempts to combine sequence-based function prediction with prediction based on the multiple alignment of networks. We assessed the accuracy of the predictions using both pairwise and multiple alignment of the PPI networks of the compared species. We therefore recommend using different strategies (i.e., pairwise or multiple alignment, with or without neighborhood networks), especially in situations where the functions of the protein are not known in advance.
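The abstract does not say which CAFA metrics were used to assess accuracy. As an assumption, the sketch below shows the protein-centric maximum F-measure (Fmax) that CAFA evaluations commonly report: precision is averaged over proteins with at least one prediction at a given score threshold, recall is averaged over all benchmark proteins, and the best F-measure over all thresholds is kept. The function name `f_max` and the toy data are hypothetical, and the sketch ignores the propagation of terms through the ontology.

```python
def f_max(predictions, truth, thresholds=None):
    """Protein-centric maximum F-measure (illustrative sketch).

    predictions: {protein: {term: confidence score in [0, 1]}}
    truth:       {protein: non-empty set of true terms}
    """
    if thresholds is None:
        thresholds = [t / 100 for t in range(1, 101)]
    best = 0.0
    for tau in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= tau}
            hits = len(predicted & true_terms)
            # Precision counts only proteins with at least one prediction.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall counts every benchmark protein.
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue  # no protein has a prediction at this threshold
        pr = sum(precisions) / len(precisions)
        rc = sum(recalls) / len(recalls)
        if pr + rc > 0:
            best = max(best, 2 * pr * rc / (pr + rc))
    return best

# Toy benchmark with two hypothetical proteins and two hypothetical terms.
predictions = {"A": {"t1": 0.9, "t2": 0.4}, "B": {"t1": 0.6}}
truth = {"A": {"t1"}, "B": {"t2"}}
score = f_max(predictions, truth)
print(round(score, 4))  # 0.6667
```

Computing the same score separately for each strategy (pairwise or multiple alignment, with or without neighborhood networks) would be one way to compare them on equal terms.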

Bioinformatics

PPI

networks

alignment

function prediction

neighborhood

gene ontology

computational assessment of function annotation