Protein @ integer is a new model for studying proteins. By upgrading protein sequences to protein integers, we strove to decode the viral superpandemic and to capture superpandemic viruses from a new perspective, the GCD, thereby unveiling the evolutionary mechanism of the epidemic.
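To make the idea of a protein integer concrete, the following is a minimal sketch only, not the encoding used in this work. It assumes that GCD denotes the greatest common divisor and that each of the 20 standard amino acids is mapped to a digit from 1 to 20 in base 21. The names `AMINO_ACIDS`, `protein_to_integer`, and `protein_gcd` are hypothetical and chosen for illustration.

```python
from math import gcd

# ASSUMPTION: illustrative encoding only, not the paper's actual mapping.
# Each standard amino acid becomes a digit 1..20 in base 21, so that
# different sequences always map to different integers.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIGIT = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def protein_to_integer(sequence: str) -> int:
    """Convert a one-letter amino acid sequence to an integer (hypothetical encoding)."""
    value = 0
    for residue in sequence.upper():
        if residue not in DIGIT:
            raise ValueError(f"unsupported residue: {residue!r}")
        value = value * 21 + DIGIT[residue]
    return value


def protein_gcd(seq_a: str, seq_b: str) -> int:
    """Greatest common divisor of the integers derived from two sequences."""
    return gcd(protein_to_integer(seq_a), protein_to_integer(seq_b))
```

Under this assumed encoding, `protein_gcd(seq_a, seq_b)` returns one integer for a pair of sequences. How such a value is judged significant is defined by the analysis in this work and is not reproduced in the sketch.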
First, the GCD results explained why SARS-CoV-2 possessed the superpandemic characteristics of the 1918 flu virus. The key to the enigma was that the SARS-CoV-2 ORF8 protein was not only the key antigen but also a multifunctional modification enzyme, corresponding to the neuraminidase of the 1918 H1N1 virus [8, 9]. Both functioned as glycosylation-modifying enzymes and RNA base-modifying enzymes. This tampering with enzymes was not found in SARS-CoV or in the pandemic 2009 H1N1 virus. Second, for SARS-CoV-2, the key antigens S and ORF8 had adhesin functions like those of the multidrug-resistant S. aureus ClfA antigen and the S. pneumoniae PspA antigen, respectively [10, 11, 12, 13]. These adhesin antigens hijacked SIg-JC or SC to enhance immune invasion. This mechanism could explain the T-cell pathology in COVID-19. Although T cells lacked ACE2 and AQP1 (Supplementary Table 6), they might be destroyed through the ORF8-SIg/SC-CD4 and ORF8-ADAR1-CD4 pathways hijacked by SARS-CoV-2 [14], similar to the way S. pneumoniae hijacks SIg SC to cause functional impairment of CD4+ cells.
But how did the SARS-CoV-2 ORF8 protein become a multifunctional modification enzyme? The versatility of proteins is related to their self-assembly. The crystal structure showed that the SARS-CoV-2 ORF8 protein formed the same self-assembled dimer as GlcNAcase and HexNAcase [15, 16, 17]. These structures revealed why dimerization was crucial for catalytic activity. The dimeric form of the ORF8 protein had also been observed in the tobacco BY2 cell expression system [18]. The progression of the ORF8 protein from a single molecule to a functional architecture showed the relationship between the evolution of protein function and self-assembly. The evolution of ORF8 self-assembly further expanded its versatility, just as gene evolution from scratch had expanded protein diversity [19]. Regulating the functional evolution of a viral protein by controlling its self-assembly could therefore be a new antiviral strategy. However, the enzymatic transition state of the ORF8 oligomer remains to be elucidated by further experiments.
The GCD results revealed a unified relationship among superpandemic virus replication, immunity, and modification. Why were SARS-CoV-2 mutation and evasion so fast? Based on the GCD results, on the one hand, the intrinsic interaction between the accessory protein ORF8 and the nonstructural proteins nsp1, 11, and 13 promoted SARS-CoV-2 replication. On the other hand, ORF8 and nsp1, 11, and 13 participated in the dual cycle of the SARS-CoV-2 life cycle and the glycan cycle by hijacking SIg-SC and modification enzymes. SIg SC hijacking and IgG Fc-glycan hydrolysis reduced antibody-mediated neutralization and opsonization, facilitating the synergistic effect of ORF8 and nsp1, 11, and 13 in enhancing virus replication and generating mutations and evasion [20]. Clinical observations of SARS-CoV-2 confirmed that deletion of the ORF8 gene broke this epidemiological evolution and returned the SARS-CoV-2 superpandemic to a normal pandemic, similar to the deletion of neuraminidase in influenza C virus.
In terms of prediction and early warning, the GCD platform enabled us to move from virus sequence up to virus GCD in order to comprehend the principles of the epidemic. Although RaTG13 and RpYN06 are closer to SARS-CoV-2 than Bat-SRBD in sequence similarity, the GCD results support that Bat-SRBD is more likely to be a superpandemic virus than RaTG13 and RpYN06. Previous experiments had also shown that Bat-SRBD was infectious in cultured cells and in mice [21]. According to the GCD analysis in Table 1, among human CoVs, the human OC43 CoV S protein and the SARS-CoV-2 S protein have a significant GCD. Although the human OC43 CoV and SARS-CoV-2 S proteins share only 29% sequence similarity, the GCD effect reflects their common function [22]. Similarly, the bat HKU3-1 CoV ORF8 and SARS-CoV-2 ORF8 proteins have a significant GCD, reflecting their common function. These GCD results were, on the one hand, supported by existing experimental studies and, on the other hand, further indicated that recombinant bat HKU3-1 CoV (OC43 CoV S protein + HKU3-1 CoV) or recombinant human OC43 CoV (OC43 CoV + SARS-CoV-2 ORF8 protein) would become a new superpandemic virus.
In summary, it is necessary to establish a superpandemic elements database (SPED) for in-depth research. We would like the WHO to organize a WHO@GCD platform to capture superpandemic strains from different kinds of pathogens, including synthetic and semi-synthetic ones. This would be conducive to a faster and earlier response to emerging infectious diseases worldwide.
May the suffering end and the world usher in light.