Retrieval of Whole Genomes and Conserved Regions
The public database of the Global Initiative on Sharing All Influenza Data (GISAID) was searched to retrieve whole genomes of the novel SARS CoV–218. Sixty-seven (67) whole genomes of SARS CoV–2 were selected for the present study (January 2020). To support the development of universal therapeutics, multiple sequence alignment was performed and a phylogenetic cladogram was constructed using the neighbor-joining method in CLC Drug Discovery Workbench (Version 3.0). Consensus sequences were then collected for the development of a vaccine, small molecules and siRNAs.
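As an illustration of what collecting a consensus sequence from an alignment involves, the following is a minimal sketch of a majority-rule consensus. It is not the procedure used inside CLC Drug Discovery Workbench; the function name `consensus`, the gap symbol and the toy alignment are assumptions made only for this example.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Majority-rule consensus of equal-length aligned sequences.

    Columns whose most frequent symbol is the gap character are dropped.
    Ties are resolved in favor of the symbol seen first in the column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    residues = []
    for column in zip(*aligned):
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            residues.append(symbol)
    return "".join(residues)


# Toy alignment (hypothetical, not taken from the GISAID genomes)
toy_alignment = ["ATGGC-TA", "ATGGCATA", "ATGAC-TA"]
print(consensus(toy_alignment))
```

A stricter variant could require a minimum column frequency before a residue is accepted, which would restrict the output to highly conserved positions.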
Chimeric Vaccine against SARS-CoV–2
Antigenicity analysis of the candidate proteins
The antigenicity of the selected Membrane Glycoprotein (NCBI accession: YP_009724393.1) and Spike Glycoprotein (NCBI accession: QHR63250.2) was determined with Vaxijen 2.018 and AntigenPro19. These servers predict antigenicity through alignment-independent methods.
Identification of B cell and T cell Epitopes
The amino acid residues of the vaccine candidates predicted to lie outside the membrane were identified with TMHMM19. The identified sequences were uploaded to the BepiPred–2.0 server to select the most promising B cell epitopes21. During this selection, surface-exposed amino acids were prioritized. The Cytotoxic T cell (CTL) and Helper T cell (HTL) epitopes were predicted by NetCTL 1.2 and NetMHC 4.0, respectively22,23. NetCTL 1.2 was used to identify CTL epitopes for 12 Major Histocompatibility Complex class I (MHC I) supertypes. The best CTL epitopes were selected based on the combined score. NetMHCII 2.2 was used to detect 15-mer HTL epitopes for human HLA-DP, HLA-DQ, and HLA-DR alleles. The most promising epitopes were chosen by evaluating affinity, percentile rank and binding level.
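The 15-mer scan and the rank-based selection described above can be sketched as follows. This does not reproduce NetMHCII or NetCTL; it only shows how overlapping peptides are enumerated and how server output might be filtered afterwards. The function names, the `rank` field and the cutoff of 2.0 are hypothetical choices for this example.

```python
def sliding_peptides(protein, k=15):
    """Return (start_position, peptide) pairs, 1-based, for every k-mer."""
    return [(i + 1, protein[i:i + k]) for i in range(len(protein) - k + 1)]


def shortlist(predictions, max_rank=2.0):
    """Keep peptides whose percentile rank is at or below max_rank.

    `predictions` is a list of dicts with at least a 'rank' key; the
    result is sorted so that the lowest (best) rank comes first.
    The default cutoff is an assumption, not a value from the study.
    """
    kept = [p for p in predictions if p["rank"] <= max_rank]
    return sorted(kept, key=lambda p: p["rank"])


# Enumerate 3-mers of a toy sequence to show the window logic
for start, peptide in sliding_peptides("ABCDE", k=3):
    print(start, peptide)
```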
Construction of Chimeric Vaccine
To construct a chimeric (multi-epitope) vaccine, all the epitopes were joined with EAAK, GPGPG and AAY linkers24. Human Beta Defensin–2 (HBD–2) (PDB ID: 1FD3) and a recombinant viral protein were added at the N-terminus and the C-terminus of the vaccine, respectively25,26. HBD–2 was conjugated because the protein can activate Toll Like Receptor 4 (TLR 4) and has chemotactic activity25,27. The recombinant viral protein was added to stimulate antiviral responses26.
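The assembly of such a construct can be sketched as simple string concatenation. The text does not state which linker joins which class of epitope, so the mapping below (EAAK around the terminal proteins, AAY between CTL epitopes, GPGPG between HTL and B cell epitopes) is an assumption, as are the function name and the placeholder sequences.

```python
def assemble_construct(adjuvant, ctl, htl, bcell, c_terminal,
                       terminal_linker="EAAK",
                       ctl_linker="AAY",
                       htl_linker="GPGPG"):
    """Join an N-terminal adjuvant, epitope blocks and a C-terminal protein.

    Assumed layout (not confirmed by the source):
    adjuvant-EAAK-[CTL joined by AAY]-GPGPG-[HTL joined by GPGPG]
    -GPGPG-[B cell joined by GPGPG]-EAAK-c_terminal
    """
    blocks = [
        ctl_linker.join(ctl),
        htl_linker.join(htl),
        htl_linker.join(bcell),
    ]
    # Skip any epitope class that was left empty
    body = htl_linker.join(block for block in blocks if block)
    return adjuvant + terminal_linker + body + terminal_linker + c_terminal


# Placeholder sequences only; these are not the epitopes of the study
construct = assemble_construct(
    adjuvant="MMM",
    ctl=["AAA", "CCC"],
    htl=["DDD"],
    bcell=["EEE"],
    c_terminal="WWW",
)
print(construct)
```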
Evaluation of Immune Response and Interferon Gamma (IFNγ)
The FASTA sequence of the vaccine was uploaded to the agent-based immune system simulator C-ImmSim (http://150.146.2.1/C-IMMSIM/index.php) to measure the immune responses28. Default parameters were used for this simulation. C-ImmSim showed adequate secretion of IFNγ, which was further evaluated with IFNepitope29.
Allergenicity and Toxicity Exploration
A recombinant vaccine can trigger an allergic response or lead to various types of toxicity. Evaluation of toxicity and allergenicity is therefore an essential step in vaccine design. Allergenicity of the vaccine was predicted by AlgPred and AllerTop v.230,31. Toxicity was predicted by ToxinPred32.
Analysis of Physicochemical Properties and Tertiary Structure
The physicochemical analysis of the protein was executed via ProtParam32. The tertiary structure was generated through Contact-guided Iterative Threading ASSEmbly Refinement (C-I-TASSER) (17). The structure was refined by GalaxyRefine35 and validated with Procheck36 and ProSAWeb37.
Molecular Docking Analysis
The crystal structure of TLR 4 was collected from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) (www.rcsb.org). The PDB ID of the structure was 3FXI. For molecular docking analysis, only Chain A of TLR 4 was retained, and it was prepared by deleting the heterogeneous atoms. This preparation was carried out in BIOVIA Discovery Studio (https://www.3dsbiovia.com/). To determine the active site pocket of Chain A, the PDB file was uploaded to the Computed Atlas of Surface Topography of proteins (CASTp)38. Finally, the vaccine and Chain A of TLR–4 were uploaded to ClusPro 2.0 for protein-protein docking39.
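The receptor preparation step (keep Chain A, drop heterogeneous atoms) was done in BIOVIA Discovery Studio; the same filtering can be sketched in a few lines that rely only on the fixed-column PDB format, where the record name occupies columns 1 to 6 and the chain identifier sits in column 22. The function name and the file name in the usage note are hypothetical.

```python
def keep_chain_atoms(pdb_text, chain="A"):
    """Return PDB text holding only ATOM records of the given chain.

    HETATM records (ligands, waters, ions) and all other chains are
    dropped. The chain ID is read from column 22 (index 21).
    """
    kept = [
        line
        for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain
    ]
    kept.append("END")
    return "\n".join(kept) + "\n"
```

Assuming the structure has been downloaded as `3fxi.pdb`, the function could be applied with `keep_chain_atoms(open("3fxi.pdb").read())` and the result written to a new file before docking.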
Codon Optimization and Visualization of Cloning
The vaccine sequence was reverse-translated and the codons were optimized for Escherichia coli (strain K12) through JCat40. The optimized DNA sequence was kept free of rho-independent transcription terminators and prokaryotic ribosome binding sites. This sequence was flanked by Nde I and Xho I restriction sites. Stop codons were added at the 3′ end of the sequence (corresponding to the C-terminal end of the protein). After these preparations, the DNA sequence was inserted into the pET28a (+) plasmid vector using SnapGene tools (www.snapgene.com).
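The construction of the cloning insert can be sketched as below. This is not JCat's algorithm: the codon table is a fixed, illustrative one-codon-per-residue choice, and the double TAA stop and the function names are assumptions. The recognition sequences CATATG (Nde I) and CTCGAG (Xho I) are both palindromic, so checking one strand is sufficient to detect internal sites.

```python
# Illustrative one-codon-per-residue table (an assumption; JCat derives
# its choices from codon adaptation data for the selected host).
CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
NDEI = "CATATG"  # Nde I recognition sequence
XHOI = "CTCGAG"  # Xho I recognition sequence


def reverse_translate(protein):
    """Map each amino acid to its codon from the table above."""
    return "".join(CODONS[residue] for residue in protein)


def build_insert(protein, stops="TAATAA"):
    """Return NdeI + coding sequence + stop codons + XhoI.

    Raises ValueError if either restriction site occurs inside the
    coding sequence, since the enzymes would then cut the insert.
    """
    cds = reverse_translate(protein) + stops
    for name, site in (("NdeI", NDEI), ("XhoI", XHOI)):
        if site in cds:
            raise ValueError(f"internal {name} site found in coding sequence")
    return NDEI + cds + XHOI


print(build_insert("MK"))
```

When an internal site is reported, a synonymous codon can be substituted at that position; how the ATG inside the Nde I site is merged with the start codon is a design decision that this sketch leaves open.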
Small Molecule Therapeutics against COVID–19
Homology Modelling and Binding site analysis
The conserved RNA-directed RNA polymerase (RdRp) sequence was retrieved to develop small molecules. C-I-TASSER was employed to build the 3D model of RdRP34. Thereafter, CASTp was applied to identify the drug binding sites, which allowed the amino acids critical for drug-protein interactions to be recognized38.
Screening the druggable compounds and molecular docking simulation
DrugBank Database (www.drugbank.ca) was used to search for potent drugs against RdRP. Molecular docking of the receptor and the selected compounds was performed in AutoDock Vina to assess the binding affinity at the binding site of RdRp42. Here, AutoDock Tools 1.5.6 was used to prepare the input pdbqt file for the receptor. The grid box size was set to X = 70, Y = 70, Z = 36 points (grid box center: x = 23.448, y = 53.587, z = –2.012; spacing = 0.5 Å). The protein-ligand complexes were visualized and analyzed with BIOVIA Discovery Studio and PyMol43.
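AutoDock Vina expects the search box dimensions in ångströms rather than grid points, so the values above need converting if they are to be placed in a Vina configuration file. The sketch below assumes the physical extent is simply points × spacing (70 × 0.5 = 35 Å, 36 × 0.5 = 18 Å); that interpretation and the function name are assumptions, while `center_x` and `size_x` (and their y/z counterparts) are standard Vina configuration keys.

```python
def vina_box(npts, spacing, center):
    """Convert a grid given in points to Vina box settings.

    Returns the box size in angstroms and the matching config lines.
    Assumes extent = number of points * spacing (an approximation).
    """
    size = tuple(n * spacing for n in npts)
    lines = [f"center_{axis} = {value}" for axis, value in zip("xyz", center)]
    lines += [f"size_{axis} = {value}" for axis, value in zip("xyz", size)]
    return size, "\n".join(lines)


# Grid values as reported in the text
size, config = vina_box(
    npts=(70, 70, 36),
    spacing=0.5,
    center=(23.448, 53.587, -2.012),
)
print(config)
```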
Pharmacoinformatics illustration
Osiris property explorer44, Molinspiration45, ACToR (Aggregated Computational Toxicology Resource)46, admetSAR (absorption, distribution, metabolism, excretion, and toxicity Structure-Activity Relationship database)47 and ACD/I-lab48 were exploited for the calculation of Absorption, distribution, metabolism, excretion (ADME) properties and toxicity profile. ADMET properties are necessary to establish an effective drug.
Nucleic Acid Based Therapeutics Development
Designing of potential siRNA molecules
To design effective siRNA molecules, I-Score Designer was employed49. I-Score Designer computes nine different siRNA design scores, namely Ui-Tei50, Amarzguioui51, Hsieh52, Takasaki53, s-Biopredsi54, i-Score55, Reynolds56, Katoh57, and Design of SIRna (DSIR)58, along with other essential parameters. The server also ranks the siRNA molecules, and the best-ranked siRNA sequence was taken for further analysis. The secondary structure of the siRNA was predicted with the RNAstructure web server59. The RNAfold web server was used to calculate the free energy of the thermodynamic ensemble for the secondary structures60. The Transcription and Translation Tool (http://biomodel.uah.es/en/lab/cybertory/analysis/trans.htm) was used to transcribe the viral DNA sequences. Finally, the designed antisense siRNAs were hybridized against the viral RNA sequences using the HNADOCK server61. HNADOCK executed RNA-RNA docking and performed molecular dynamics simulations to refine the 10 best predicted siRNA-mRNA complexes. The models with the best docking scores for Membrane Glycoprotein mRNA and Spike Glycoprotein mRNA were visualized with PyMol.
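The transcription step and the relationship between a target site and its antisense strand can be illustrated with a short sketch. It does not reproduce the scoring performed by I-Score Designer; the function names and the toy sequence are assumptions for the example.

```python
# Watson-Crick complement for RNA bases
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def transcribe(dna):
    """Write a coding-strand DNA sequence as RNA (T replaced by U)."""
    return dna.upper().replace("T", "U")


def antisense(rna_target):
    """Return the strand that base-pairs with rna_target, written 5' to 3'."""
    return rna_target.translate(RNA_COMPLEMENT)[::-1]


# Toy target site (hypothetical, not a site selected in the study)
target = transcribe("ATGC")
print(target, antisense(target))
```

Reversing the complemented string is what keeps both strands written in the 5' to 3' direction, which is the orientation docking and folding servers generally expect as input.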
Interferon stimulating genes (ISGs) based Interferon Therapy
Comparative Genomics
The genome of SARS CoV–2 was compared with those of SARS CoV and MERS CoV. We ran BLAST searches of SARS CoV–2 against SARS CoV and MERS CoV using the Megablast and Discontiguous Megablast algorithms. The side-by-side genome comparison was visualized with the Artemis Comparison Tool62.
Exploring SARS-CoV expression profile
The gene expression profile (GSE5972) of SARS-CoV was collected from the National Center for Biotechnology Information (NCBI). The data were then normalized across 10 healthy samples and 54 SARS patient samples; recovered cases were excluded from this study. We used the limma R package to identify the differentially expressed genes63. iDEP was used to create a hierarchical clustering heatmap and a boxplot to visualize the data and their distribution for both up- and downregulated genes64.
Gene Enrichment Analysis
Kyoto Encyclopedia of Genes and Genomes (KEGG) and Enrichr4 were employed to identify the genes involved in pathways and in Gene Ontology (GO) categories, namely biological process (BP), molecular function (MF) and cellular component (CC)65–67. The genes with GO accession IDs and KEGG pathways were listed for further study. ShinyGO was used to construct a clustered tree of the top 30 GO terms in BP, MF and CC for both up- and downregulated genes68.
Identification of Viral Genes
We scrutinized the down- and upregulated genes in BP, MF and CC from the gene ontology dataset to find virus-related genes, especially those enriched in the regulation of viral production and cytokine regulation.
Exploration of Interferon Stimulating Genes (ISGs)
The INTERFEROME69 database was analyzed to identify the Interferon Stimulating Genes (ISGs) and potential interferons. We then used Reactome70 to explore the pathways of the ISGs that modulate the immune system and interferon regulation.