Genomic and Proteomic Comparative Analysis of SARS-CoV-2 versus SARS-CoV-GD01

Using bioinformatics analysis, the whole genome of SARS-CoV-2, which emerged in 2019, and its deduced proteome were compared with the corresponding data for SARS-CoV-GD01, which emerged in 2003 in China. The genome sequences of the two viruses were obtained from NCBI. Protein sequences for all genes of both genomes were aligned and displayed using the Clustal Omega tool.


Results
The name coronavirus is derived from the Latin word corona (crown), referring to the crown-like arrangement of proteins around the virion. SARS-CoV-2 is a positive-sense single-stranded RNA (ssRNA) virus with a genome of 29,903 bases. It belongs to the family Coronaviridae, genus Betacoronavirus. The natural hosts of SARS-CoV-2 are humans, other vertebrate animals and bats. The primary sites of infection are the epithelial cells of the respiratory and enteric tracts.
Several coronaviruses infect humans: SARS-CoV, MERS-CoV, HKU1, NL63, OC43, 229E and finally SARS-CoV-2 (Corman et al. 2018). SARS-CoV-2 interacts with the cell receptor ACE2 (angiotensin-converting enzyme 2) (Wan et al. 2020; Wrapp et al. 2020). SARS-CoV-2 was further confirmed to bind human ACE2 with high affinity and to use it as an entry receptor to invade target cells (Walls et al. 2020). Those authors concluded that the cryo-EM structures of the SARS-CoV-2 spike glycoprotein, together with the inhibition of spike-mediated entry by SARS-CoV polyclonal antibodies, may provide a blueprint for designing vaccines specific to this virus.
SARS-CoV-2, which emerged in China and spread worldwide within a short period, causes COVID-19, a respiratory disease (pneumonia) that can involve cardiac complications, and it is transmitted through respiratory droplets. There are two main hypotheses regarding the origin of SARS-CoV-2. The first is natural selection in an animal or human host; the second assumes that the virus emerged through laboratory manipulation of a related SARS-CoV-like coronavirus.
Because both SARS-CoV-2 (2019) and SARS-CoV-GD01 (2003) emerged in China, a comparative bioinformatics analysis of their genomic and proteomic data may shed light on the extent and nature of the evolutionary changes in the new virus. This would help to define its specific features and limits, paving the way for knowledge-based remedies and drugs while also benefitting from the wealth of information on the older virus. We present perspectives on the interconnected features of the SARS-CoV-2 and SARS-CoV-GD01 genomes, which may improve our understanding of the new virus and help to infer the pathway that led to it, i.e., natural selection or laboratory genetic manipulation. Qualifying and quantifying the specificity and extent of the genomic and proteomic alterations between the two viruses over 16 years will not only support the design of appropriate drugs and strategies against the current virus, but may also allow potential outbreaks of newer versions to be anticipated in the coming decades.

Alignment of SARS-CoV-2 and SARS-CoV-GD01 biological sequences
Protein sequences for all genes of both genomes were aligned and displayed using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/) to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two biological sequences, and to show the conserved and variable regions, especially regions of nucleotide insertion and deletion. Ideally, this reflects most of the evolutionary events that have occurred in the viruses. The numbers of genes and deduced proteins of both genomes were also compared in order to compare the features of the whole proteome, which are considered the informative basis of viral strength, virulence and behavior.
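The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name summarize_alignment, the toy sequences and the definition of percent identity (identical residues divided by columns without gaps) are hypothetical choices, not taken from the analysis itself. The input is assumed to be two rows of an existing pairwise alignment, with "-" marking gaps, as exported from an alignment tool.

```python
def summarize_alignment(ref_aligned, query_aligned, gap="-"):
    """Count identities, mismatches, insertions and deletions column by column.

    Both arguments are rows of the same alignment, so they must have equal
    length. A gap in the reference row is counted as an insertion in the
    query; a gap in the query row is counted as a deletion.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have the same length")

    counts = {"identical": 0, "mismatch": 0, "insertion": 0, "deletion": 0}
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char == gap and query_char == gap:
            continue  # column that is empty in both rows carries no information
        if ref_char == gap:
            counts["insertion"] += 1
        elif query_char == gap:
            counts["deletion"] += 1
        elif ref_char == query_char:
            counts["identical"] += 1
        else:
            counts["mismatch"] += 1

    # One common convention (an assumption here): identity over ungapped columns.
    ungapped = counts["identical"] + counts["mismatch"]
    counts["percent_identity"] = 100.0 * counts["identical"] / ungapped if ungapped else 0.0
    return counts


# Toy, made-up aligned fragments (not real viral sequences).
reference = "MFIFLL--TSGS"
query = "MFVFLVLLTS--"
summary = summarize_alignment(reference, query)
print(summary)
```

Applied per protein, such a summary gives insertion and deletion counts of the kind reported in a per-gene table; it treats every gap column as one event and does not group adjacent gaps into a single indel.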

Genomic characteristics of SARS-CoV-2 and SARS-CoV-GD01
Genomic analysis of the SARS-CoV-GD01 genome (29,757-base RNA) revealed that it consists of 11 genes encoding 12 proteins (Table 1). The insertion or deletion of one or more nucleotides in a genomic sequence may shift the way the sequence is read in both genomic and mRNA codons. The frequencies of all insertions and deletions are listed in Table 2. For instance, if one nucleotide is deleted from the genomic RNA sequence, the reading frame is disrupted from the mutation site onward. This may create new combinations of incorrect amino acids in the corresponding protein. In contrast, if three nucleotides are inserted or deleted, the reading frame of the remaining mRNA codons does not change; however, the final protein will have one extra or one missing amino acid. Insertion and deletion mutations therefore produce different proteins with altered amino acid sequences. It may be assumed that all of the above changes could have occurred in coronaviruses as normal evolutionary events, which perhaps supports rejecting the laboratory manipulation hypothesis. However, the large total numbers of insertions (168) and deletions (240) may point to a new strain of the virus with completely new features that need to be studied carefully and thoroughly to enable competent tools for viral management and treatment. With such extensive genetic and proteomic alterations in the new coronavirus, it is becoming urgent to find new strategies and drugs for controlling the rapidly spreading pandemic.
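The reading-frame argument in this section can be illustrated with a minimal Python sketch. The RNA fragment below is a made-up toy sequence, and the helper names translate and delete are hypothetical; only the standard genetic code is assumed. Deleting one nucleotide changes every downstream codon, whereas deleting three nucleotides removes one amino acid and leaves the rest of the protein unchanged.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA string codon by codon, ignoring a trailing partial codon."""
    usable_length = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable_length, 3))


def delete(rna, start, length):
    """Return the sequence with `length` nucleotides removed from index `start`."""
    return rna[:start] + rna[start + length:]


# Toy fragment (not a real viral sequence): AUG GCU GAA UUC AAA GGU
original = "AUGGCUGAAUUCAAAGGU"

print(translate(original))                # MAEFKG
print(translate(delete(original, 3, 1)))  # MLNSK  (frameshift: all downstream residues change)
print(translate(delete(original, 3, 3)))  # MEFKG  (in-frame: one residue is missing)
```

The sketch deletes a whole codon in the in-frame case; a three-nucleotide deletion that spans two codons would also keep the frame but could additionally alter the residue at the junction.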

Discussion
The newly observed extra amino acids added to the SARS-CoV-2 proteome, especially in the spike (S) glycoprotein, could stand behind the new features of this virus, including its ability to bind human cell receptors. The novel genomic characteristics of SARS-CoV-2 presented here may partly pinpoint the genomic structural changes responsible for the severe infectivity and the unprecedented, extremely high transmissibility of this virus in humans worldwide. Although the analysis indicates that SARS-CoV-2 is not a manipulated virus, the available data do not allow the second hypothesis on the viral origin to be discarded entirely. Hence, more scientific research on other viral isolates is needed to discern the real origin of the virus unequivocally. Currently, we face a critical situation in which no specific antiviral treatment is recommended for COVID-19, a long time is required to develop a specific vaccine against SARS-CoV-2, and the most widely used treatments for infected people are symptomatic, based principally on oxygen therapy for patients with severe infection or on dissolving blood clots. Vaccines prompt the body's immune system to attack viruses efficiently and specifically at the initial complete-particle stage, outside living cells. A vaccine can therefore protect healthy people from viral infection but cannot treat infected people. Antivirals can treat infected people, but they only inhibit viral development and activity and do not destroy the virus itself. The most difficult obstacle in designing a vaccine or antiviral is viral genetic variation and mutation.
With such extensive genetic and proteomic alterations in SARS-CoV-2, it is becoming urgent to find new strategies and drugs for controlling the spread of the virus.
Since the main concept behind designing an antiviral is defining the viral protein to be targeted, this may be difficult here because the new virus has altered nearly all of its proteins, so the antivirals designed for previous coronavirus strains are no longer serviceable. New drugs must be designed, prepared and tested in vitro and in animals, then validated in human clinical trials. This is time- and effort-consuming but indispensable. One antiviral strategy is to produce factors that resemble the viral attachment proteins, so that they bind to the host cell membrane and prevent viral attachment, or factors that resemble the host cellular factors, so that they bind to the viral protein and block its communication with the cells. This drug-design strategy can be very expensive and time-consuming, but it is necessary. Another strategy is to arrest the virus at the replication stage by developing nucleotide or nucleoside analogues that interfere with viral amplification and replication. These drugs, however, also depend on the genetic characteristics of the viral RNA sequences. When these sequences have undergone major changes, the old remedies will no longer be of use against the new coronavirus.
Another mechanism for counteracting viruses is stimulating the body's immune system to attack a range of pathogens, e.g., with interferons, which inhibit viral synthesis in infected cells (Samuel, 2001). However, viruses can become resistant through spontaneous mutations. A deletion at amino acid positions 245-248 in the neuraminidase of influenza A virus subtype H3N2, which occurred after initiation of oseltamivir treatment, strongly reduced inhibition of the virus by oseltamivir (Trebbien et al. 2018). The most commonly used method for treating resistant viruses is combination therapy, which uses multiple antivirals in one treatment regimen. This is thought to decrease the likelihood that a single mutation could cause antiviral resistance, as the antivirals in the cocktail target different stages of the viral life cycle (Moscona, 2009).
At the start of COVID-19 epidemic control, most treatments were mainly symptomatic. Owing to the lack of efficient and specific treatments and the need to contain the epidemic, some older antiviral or general drugs were resorted to, e.g., chloroquine, remdesivir, lopinavir, ribavirin, ritonavir and teicoplanin (Baron et al. 2020). Chloroquine has multiple mechanisms of action. It can inhibit a pre-entry step of the viral cycle by interfering with the binding of viral particles to their cell surface receptor, and it can inhibit quinone reductase 2. Viruses may also develop resistance to these substances. To circumvent entirely the viral genomic and proteomic alterations that enable viruses to escape natural and developed immunity, another approach may prove effective once it has received due research. This approach uses basic proteins and peptides, which were first confirmed to have antibacterial activity; a few studies then demonstrated their effectiveness against viruses. These proteins can be found natively, e.g., lactoferrin, or can be prepared chemically by esterification, which neutralizes the negatively charged carboxyl groups of the aspartyl and glutamyl residues on protein molecules and makes the net charge of the protein positive (Sitohy et al. 2000). Globally, these results suggest a wide-spectrum specificity of these chemically modified proteins against different viruses and pathogenic bacteria, nominating them as potentially effective candidates for treating COVID-19 and other epidemic viral outbreaks. They can be prepared from many available native proteins, their properties can be controlled and designed, and they have been preliminarily shown to be non-toxic (Sitohy et al. 2013).
Nevertheless, further pharmacological and pharmaceutical studies are required to define the best treatment approach, with due insight into the potential mechanism and the requirements for achieving the best antiviral action of these substances against SARS-CoV-2.

Conclusion
Evolutionary changes by insertion and deletion of nucleotides occur normally in coronaviruses, producing different viral proteins with multiple altered amino acids. Our analyses show that SARS-CoV-2 developed molecularly from SARS-CoV-GD01 through major alterations in the viral genes and their translated proteins. We conclude that the variable regions in the SARS-CoV-2 genome, especially in the orf1ab, spike and ORF10 genes, should be used in the molecular diagnosis of this virus and may be targets for designing specific antiviral drugs. The genomic and proteomic alterations in the virus necessitate the search for new remedies and vaccines.
The qualified and quantified genomic and proteomic alterations between the two coronaviruses over 16 years will not only support the design of appropriate drugs and strategies for confronting the current virus, but may also allow potential outbreaks of newer versions to be anticipated in the coming decades, so that preparations can be made beforehand.
Since developing specific vaccines or other specific antivirals takes a long time, new non-specific antiviral drugs should be developed for use during epidemics and pandemics.
Positively charged proteins and their peptides, which can be isolated from natural sources or prepared chemically, may be a good choice for confronting the globally spreading SARS-CoV-2 and other epidemic viral outbreaks, given their wide-spectrum activity against viruses and pathogenic bacteria and their safety. Yet extensive pharmacological and pharmaceutical studies are critically and urgently needed to establish best medical practice against SARS-CoV-2, the causative virus of COVID-19.