Objective: In December 2019, a novel coronavirus (SARS-CoV-2), the cause of the current COVID-19 pandemic, was identified in Wuhan, China. Many questions have been raised about its origin and its adaptation to humans. In the present work, we performed a genetic analysis of the Spike glycoprotein (S) of SARS-CoV-2 and of related coronaviruses (CoVs) isolated from different hosts, in order to trace the evolutionary history of this protein and the adaptation of SARS-CoV-2 to humans.
Results: Based on sequence analysis of the S gene, we suggest that SARS-CoV-2 originated from recombination events between bat and pangolin CoVs. The hybrid SARS-CoV-2 ancestor jumped to humans and has been maintained by natural selection. Although the S gene of the RaTG13 bat CoV has high nucleotide identity with that of SARS-CoV-2, the phylogenetic tree and the haplotype network suggest that these CoVs do not have a direct parental relationship. Moreover, it is likely that SARS-CoV-2 acquired the basic function of the receptor-binding domain (RBD) of the S protein from the MP789 pangolin CoV by recombination, and that it has been highly conserved.