SARS-CoV-2 has 16 non-structural proteins (nsps), four structural proteins, and six accessory proteins.
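As a quick consistency check on this inventory, the three protein classes can be tallied in a short Python sketch. The variable and function names are illustrative and are not part of the original work; the protein names are those used as headings in this section.

```python
# Protein inventory as named in this section (identifiers are illustrative).
NON_STRUCTURAL = [f"nsp{i}" for i in range(1, 17)]  # nsp1 ... nsp16
STRUCTURAL = ["S", "E", "M", "N"]
ACCESSORY = ["ORF3a", "ORF6", "ORF7a", "ORF7b", "ORF8", "ORF10"]


def inventory_counts():
    """Return the number of proteins in each class and the overall total."""
    counts = {
        "non-structural": len(NON_STRUCTURAL),
        "structural": len(STRUCTURAL),
        "accessory": len(ACCESSORY),
    }
    counts["total"] = sum(counts.values())
    return counts


print(inventory_counts())
```

The total of 26 proteins corresponds to the 26 gene-sequence figures (Fig. 1 to Fig. 26) referenced in the sections that follow.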
Nsp1
The viral genomic RNA is translated by the host cell to produce the non-structural proteins (nsps). The nsps maintain cellular conditions favorable for viral infection and viral mRNA synthesis. SARS-CoV-2 non-structural protein 1 (nsp1) is considered the host shutoff factor and is known to suppress host innate immune functions (Schubert et al. 2020). Nsp1 binds to the host ribosome, thereby interfering with mRNA binding and mediating translation inhibition. In addition, nsp1 inhibits interferon signaling (Vazquez et al. 2021). The gene sequence of nsp1 is shown in Fig. 1.
Nsp2
Nsp2 is involved in translation repression, endosomal transport, ribosome biogenesis, and actin filament binding (Zheng et al. 2021; Gupta et al. 2021). The nsp2 protein structure contains three zinc fingers (ZnFs). A large positively charged area on its surface suggests that nsp2 can bind nucleic acids and regulate intracellular signaling pathways (Ma et al. 2021). The gene sequence of nsp2 is shown in Fig. 2.
Nsp3
Nsp3 is a transmembrane protein and is the largest protein encoded by the SARS-CoV-2 genome (Thomas, 2021). Nsp3 is involved in viral replication and functions as a protease. Encoded within nsp3 is the papain-like protease (PLproCoV-2) that cleaves nsp1, nsp2 and nsp3 (Klemm et al. 2020; Armstrong et al. 2021). Nsp3 also contains several macrodomain folds that counter host innate immunity. The CoV-2 Mac1 domain possesses mono(ADP-ribosyl) hydrolase activity in vitro, reversing PARP14 modifications, and is proposed to remove single ADP-ribose modifications from host protein substrates in cells. Viral macrodomains are believed to counter or hijack host immunity by reversing the mono(ADP-ribosyl) modifications generated by host PARP14 enzymes, thereby interfering with interferon production and altering STAT1 regulation. This is a possible link to the damaging and deadly cytokine storm syndrome observed in severe COVID-19 cases (Brosey et al. 2021; Lei et al. 2021). The gene sequence of nsp3 is shown in Fig. 3.
Nsp4
Nsp4 has the largest extra-transmembrane domain (loop) among the nsps and the second largest extra-transmembrane domain of SARS-CoV-2 after the spike protein (Thomas, 2021). Nsp4 anchors the viral replication-transcription complex to modified endoplasmic reticulum membranes. Nsp4, together with nsp3, may play a crucial role during viral RNA synthesis and double-membrane vesicle formation (Santerre et al. 2021). The gene sequence of nsp4 is shown in Fig. 4.
Nsp5
Nsp5 is the main protease (3CLpro or Mpro) and, along with the papain-like protease domain of nsp3, is responsible for cleavage of the Orf1ab polypeptides and release of the mature nsps. Nsp5 is responsible for processing nsp4 through nsp16 (Klemm et al. 2020). Inhibition of nsp5-mediated cleavage would prevent production of the mature nsps and thereby prevent viral replication. The gene sequence of nsp5 is shown in Fig. 5.
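The division of labor between the two viral proteases described in this section can be summarized in a minimal Python sketch. It encodes only what the text states (the papain-like protease of nsp3 cleaves nsp1, nsp2 and nsp3; nsp5 processes nsp4 through nsp16) and, as a simplifying assumption, attributes each nsp to a single protease. The function name and labels are hypothetical.

```python
def releasing_protease(nsp_number: int) -> str:
    """Return the protease that the text associates with a given nsp.

    Simplified assumption: PLpro (within nsp3) handles nsp1-nsp3 and
    nsp5 (3CLpro/Mpro) handles nsp4-nsp16.
    """
    if not 1 <= nsp_number <= 16:
        raise ValueError("SARS-CoV-2 nsps are numbered 1 to 16")
    return "PLpro (nsp3)" if nsp_number <= 3 else "3CLpro/Mpro (nsp5)"


# Group all 16 nsps by the protease that processes them.
by_protease = {}
for n in range(1, 17):
    by_protease.setdefault(releasing_protease(n), []).append(f"nsp{n}")

for protease, products in by_protease.items():
    print(f"{protease}: {len(products)} nsps ({products[0]} to {products[-1]})")
```

Under this simplification, nsp5 accounts for 13 of the 16 nsps, which illustrates why inhibiting nsp5-mediated cleavage would be expected to block most nsp maturation.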
Nsp6
The SARS-CoV-2 replication organelle is made of double-membrane vesicles (DMVs) and connectors. Nsp6 is a transmembrane protein (Thomas, 2021) and acts as an organizer of DMV clusters, mediated through lipid droplet-derived lipids. Properly formed nsp6 connectors and lipid droplets are required for the replication of SARS-CoV-2 (Ricciardi et al. 2022). Nsp6 also induces inflammatory cell death in lung epithelial cells (Sun et al. 2022). The gene sequence of nsp6 is shown in Fig. 6.
Nsp7 and nsp8
The SARS-CoV-2 replication/transcription machinery consists of an RNA-dependent RNA polymerase (RdRp, also known as nsp12) working in tandem with the nsp7 and nsp8 proteins, forming an RdRp–nsp7–nsp8 supercomplex. Nsp7, nsp8 and nsp12 are all involved in polymerase activity. The nsp12-nsp7-nsp8 supercomplex is thus defined as the minimal core component for mediating coronavirus RNA synthesis (Peng et al. 2020). The nsp7-nsp8 dimer could act as a primase for the nsp7-nsp8-nsp12 replication complex (Konkolova et al. 2020). The gene sequences of nsp7 and nsp8 are shown in Fig. 7 and Fig. 8, respectively.
Nsp9
Nsp9 contributes to virulence and viral replication. Dimerization of nsp9 increases its nucleic acid binding affinity, and the N-finger motif appears to play a critical role in dimerization (de O. Araújo et al. 2021). Nsp9 has been found to interact with the RNA-dependent RNA polymerase (RdRp/nsp12) to form part of the replication and transcription complex (RTC), an essential component for viral replication (El-Kamand et al. 2021). The gene sequence of nsp9 is shown in Fig. 9.
Nsp10
Nsp10 is a conserved stimulator of two enzymes crucial for viral replication, nsp14 and nsp16, which exhibit exoribonuclease and methyltransferase activities (Kozielski et al. 2021). In particular, nsp10 serves as a stimulatory factor for the exoribonuclease activity of nsp14. Disruption of the interaction between nsp10 and nsp14, or inactivation of the nsp14 exoribonuclease activity, would decrease replication fidelity and promote lethal mutagenesis (Lin et al. 2021). The gene sequence of nsp10 is shown in Fig. 10.
Nsp11
Nsp11 is the smallest non-structural protein of SARS-CoV-2, with only 13 amino acids. It is produced when the 3CLpro/Mpro protease cleaves the pp1a polyprotein at the nsp10/11 junction (Gadhave et al. 2021). The function of nsp11 is not clearly understood. The gene sequence of nsp11 is shown in Fig. 11.
Nsp12
Nsp12, the viral RNA-dependent RNA polymerase (RdRp), suppresses host antiviral responses. Nsp12 attenuates type I IFN responses by inhibiting IRF3 nuclear translocation (Wang et al. 2021). The gene sequence of nsp12 is shown in Fig. 12.
Nsp13
Nsp13 belongs to the helicase superfamily and catalyzes the unwinding of double-stranded DNA or RNA in a 5′ to 3′ direction. Nsp13 has been shown to interact with the viral RNA-dependent RNA polymerase nsp12 and acts in concert with the replication-transcription complex (nsp7/nsp8/nsp12). This interaction stimulates the helicase activity of nsp13. In addition, nsp13 possesses RNA 5′ triphosphatase activity and is involved in the formation of the viral 5′ mRNA cap (Newman et al. 2021). The nsp13 helicase also suppresses IFN signaling by targeting JAK1 phosphorylation of STAT1 (Fung et al. 2022). The gene sequence of nsp13 is shown in Fig. 13.
Nsp14
Nsp14 is involved in viral replication and in escape from immune surveillance. The N-terminal region of nsp14 has an exonuclease (ExoN) domain that cleaves mismatched nucleotides to ensure accurate replication of the viral genome. Owing to this proofreading mechanism, coronaviruses have a lower mutation rate than other RNA viruses. In addition, nsp14 is involved in capping the 5′ end of the viral RNA genome, which helps the virus evade immune surveillance. The C-terminal region of nsp14 functions as an S-adenosyl methionine-dependent guanine-N7 methyltransferase that is independent of the ExoN activity (Zaffagni et al. 2022). Overexpression of nsp14 induces a near-complete shutdown of host cellular protein synthesis (Hsu et al. 2021). The gene sequence of nsp14 is shown in Fig. 14.
Nsp15
Nsp15, commonly called EndoU, is a uridine-specific endoribonuclease conserved across coronaviruses. The nuclease activity of nsp15 helps the virus evade activation of host immune responses (Frazier et al. 2021; Pillon et al. 2021). The gene sequence of nsp15 is shown in Fig. 15.
Nsp16
Nsp16, a 2′-O-methyltransferase (2′-O-MTase), is part of the replication-transcription complex. Nsp16 and nsp10 form a protein complex that prevents recognition of viral RNAs by host innate immunity. Nsp16 transfers a methyl group from its S-adenosylmethionine (SAM) cofactor to the 2′ hydroxyl of the ribose sugar of viral mRNA. This methylation improves translation efficiency and camouflages the mRNA so that it is not recognized by intracellular pathogen recognition receptors such as IFIT and RIG-I. Importantly, inhibiting or knocking out 2′-O-MTase activity severely attenuates viral replication and infectivity of coronaviruses (Lin et al. 2020; Vithani et al. 2021). The gene sequence of nsp16 is shown in Fig. 16.
S
The structural proteins of SARS-CoV-2 include the membrane glycoprotein (M), envelope protein (E), nucleocapsid protein (N), and spike protein (S) (Thomas, 2020). The S protein of SARS-CoV-2 plays a key role in receptor recognition and the cell membrane fusion process. The S protein is composed of two subunits, S1 and S2. The S1 subunit binds to the host receptor angiotensin-converting enzyme 2 (ACE2), while the S2 subunit mediates fusion of the viral and cell membranes by forming a six-helix bundle via its two heptad repeat domains (Huang et al. 2020a). Most of the current vaccines against COVID-19 are based on the S protein (Polack et al. 2020; Folegatti et al. 2020; Baden et al. 2021; Sadoff et al. 2021). The S glycoprotein is a focus of vaccine development because it is the primary target of host immune defenses (Bangaru et al. 2020). The gene sequence of S is shown in Fig. 17.
ORF3a
The accessory proteins of SARS-CoV-2 are viral proteins whose roles during infection are still not completely understood (Redondo et al. 2021). The accessory protein ORF3a of SARS-CoV-2 is a transmembrane protein. ORF3a induces cell death through apoptosis, necrosis, and pyroptosis, leading to tissue damage that affects the severity of COVID-19 (Thomas, 2021; Zhang et al. 2022). ORF3a is a viroporin that could function as an ion channel protein. Under hypoxic conditions, ORF3a induces cellular innate and pro-inflammatory immune responses that can trigger a cytokine storm by activating NLRP3 inflammasomes, HMGB1, and HIF-1α to promote the production of pro-inflammatory cytokines and chemokines (Bianchi et al. 2021; Zhang et al. 2022). The gene sequence of ORF3a is shown in Fig. 18.
E
The envelope protein (E) is the smallest transmembrane structural protein of SARS-CoV-2 (Thomas, 2020). E is a 75-residue viroporin that mediates the budding and release of progeny viruses and activates the host inflammasome (Mandala et al. 2020). E interacts with Zona Occludens-1 (ZO1), one of the key regulators of tight junction formation in infected epithelial cells. This interaction may contribute, in part, to tight junction damage and compromise of the epithelial barrier, leading to enhanced virus spread and severe dysfunction that results in morbidity (Shepley-McTaggart et al. 2021). The gene sequence of the E protein is shown in Fig. 19.
M
The most abundant structural protein of SARS-CoV-2 is the membrane (M) glycoprotein; it spans the membrane bilayer, leaving a short NH2-terminal domain outside the virus and a long COOH terminus (cytoplasmic domain) inside the virion. The other structural proteins of SARS-CoV-2 can bind to the M protein. The M protein of SARS-CoV-2 is structurally similar to SemiSWEET sugar transport proteins of prokaryotes. The SemiSWEET sugar transporter-like structure of the M protein may be involved in rapid proliferation, replication, and immune evasion of the SARS-CoV-2 virus (Thomas, 2020). The M protein has been reported to suppress host IFN-I production (Sui et al. 2021). M proteins may influence the properties of S proteins and promote the assembly of SARS-CoV-2 viral particles (Boson et al. 2021). The gene sequence of M protein is shown in Fig. 20.
ORF6
The accessory protein ORF6 is a small protein of approximately 7 kDa, consisting of 61 amino acids. ORF6 antagonizes the host innate immune system via the Janus kinase 1 (JAK1) and JAK2-signal transducer and activator of transcription 1 (STAT1) pathway. ORF6 inhibits the nuclear transport of PY-STAT1 to suppress primary interferon signaling (Lee et al. 2021; Miyamoto et al. 2022). ORF6 can also prevent the nuclear export of host mRNA and further downregulate the expression of newly transcribed transcripts (Li et al. 2022). The gene sequence of the ORF6 protein is shown in Fig. 21.
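The relationship between the 61-residue length of ORF6 and its mass of approximately 7 kDa can be checked with a back-of-the-envelope calculation. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the original text and is not sequence-specific. The function and constant names are hypothetical; the residue counts for E and nsp11 are those quoted elsewhere in this section.

```python
# Assumption: ~110 Da per residue is a rule-of-thumb average, not an exact value.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kDa from the number of residues."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Residue counts as quoted in this section.
lengths = {"ORF6": 61, "E": 75, "nsp11": 13}
for name, length in lengths.items():
    print(f"{name}: {length} aa, ~{approx_mass_kda(length):.1f} kDa")
```

For ORF6 the estimate is about 6.7 kDa, consistent with the approximately 7 kDa quoted above. An exact value would require the actual amino acid sequence.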
ORF7a
ORF7a is an immunomodulatory factor that binds immune cells and triggers dramatic inflammatory responses (Zhou et al. 2021). ORF7a efficiently binds to CD14+ monocytes in human peripheral blood. ORF7a is a key viral factor that contributes to the recruitment of monocytes to infected lungs during COVID-19. ORF7a may also suppress the antigen-presenting ability of these monocytes (Zhou et al. 2021). The gene sequence of the ORF7a protein is shown in Fig. 22.
ORF7b
The accessory protein ORF7b promotes expression of IFN-β, TNF-α, and IL-6, activates type I IFN signaling through IRF3 phosphorylation, and activates TNF-α-induced apoptosis (Yang et al. 2021). The symptoms of COVID-19 include heart arrhythmias, odor loss, impaired oxygen uptake and intestinal dysfunction. Leucine zippers are involved in heart rhythm regulation through oligomerization of phospholamban in cardiomyocytes. ORF7b multimerizes through a leucine zipper and therefore has the potential to interfere with important cellular processes that involve leucine-zipper formation (Fogeron et al. 2021). The gene sequence of the ORF7b protein is shown in Fig. 23.
ORF8
ORF8 is an accessory protein that has been proposed to interfere with immune responses. ORF8 disrupts IFN-I signaling and also down-regulates MHC-I (Flower et al. 2020). ORF8 could activate the IL-17 signaling pathway and promote the expression of pro-inflammatory factors, thereby contributing to the cytokine storm during SARS-CoV-2 infection. The gene sequence of the ORF8 protein is shown in Fig. 24.
N
The SARS-CoV-2 nucleocapsid (N) protein is an abundant structural RNA-binding protein critical for viral genome packaging (Thomas, 2020; Cubuk et al. 2021). The N protein is a highly immunogenic viral protein that plays essential roles in replication and virion assembly (Thomas, 2022). N exists in a phosphorylated state in the cytoplasm; however, it is predominantly dephosphorylated in mature virions. A major function of N is to encapsidate the ssRNA viral genome to evade immune detection and to protect the viral RNA from degradation by host factors (Wu et al. 2021). The gene sequence of the N protein is shown in Fig. 25.
ORF10
The accessory protein ORF10 suppresses the expression of type I interferon (IFN-I) genes and IFN-stimulated genes (Li et al. 2022). Loss of smell and taste are symptoms of COVID-19 and may be related to cilia dysfunction. ORF10 is known to impair cilia function, thereby offering a powerful etiopathological explanation for how SARS-CoV-2 causes multiple cilia-dysfunction-related symptoms specific to COVID-19 (Wang et al. 2022). The gene sequence of the ORF10 protein is shown in Fig. 26.