A comprehensive bioinformatics analysis was conducted with prediction software to determine (i) the T- and B-cell epitopes, (ii) the secondary and tertiary structures, and (iii) the antigenicity of the fusion protein.
2.1. Protein sequence retrieval and construct design (S1-NTD-HA2-IFNγ)
The full-length open reading frame of the Spike_S1_NTD gene (YP_009724390) from SARS-CoV-2, without its signal peptide sequence and the TAA stop codon, was fused in-frame to the HA2 gene (YP_009118626.1) from Influenza A virus/California/07/2009(H1N1) and to the human IFNγ gene (NM_000619) through two (G4S)3 linkers. The 3′ end of the fusion gene includes a six-histidine tag for easier purification.
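The assembly described above can be sketched at the protein level as a simple concatenation. This is a minimal illustration, not the authors' workflow: the three domain strings are hypothetical placeholders rather than the real sequences behind the accession numbers, and the domain order is assumed from the construct name S1-NTD-HA2-IFNγ.

```python
# (G4S)3 linker and six-histidine tag, as named in the text.
LINKER = "GGGGS" * 3
HIS_TAG = "H" * 6

# Hypothetical placeholders, NOT the real domain sequences. In practice
# these would be the protein sequences retrieved for YP_009724390 (S1 NTD,
# signal peptide removed), YP_009118626.1 (HA2) and NM_000619 (IFN-gamma).
S1_NTD_PLACEHOLDER = "AAAA"
HA2_PLACEHOLDER = "CCCC"
IFNG_PLACEHOLDER = "DDDD"


def build_construct(domains, linker=LINKER, tag=HIS_TAG):
    """Join the domains with the linker and append the C-terminal tag."""
    return linker.join(domains) + tag


construct = build_construct(
    [S1_NTD_PLACEHOLDER, HA2_PLACEHOLDER, IFNG_PLACEHOLDER]
)
print(construct)
print("length:", len(construct))
```

Joining on the linker guarantees exactly two linkers for three domains, which matches the "two (G4S)3 linkers" stated in the text.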
2.2. B-cell and T-cell epitope prediction and confirmation of antigenicity
Several freely accessible online servers, including ABCpred, BepiPred, BCPred, and IEDB (Immune Epitope Database and Analysis Resource), were used to predict the B-cell epitopes of the fusion protein. The online prediction tools IEDB, SYFPEITHI, ProPred-I, ProPred, NetMHC, NetCTL (Net Cytotoxic T Lymphocyte), and CTLPred were used to predict the T-cell epitopes. All predicted epitopes were classified as probable antigens by VaxiJen v2.0 (threshold 0.4).
The final predicted epitopes should also be checked by immunoproteasome analysis against the dominant host-cell enzymes, so that the peptides are not degraded during antigen processing. The Protein Digest server was therefore used to predict enzymatic degradation sites.
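The idea behind a digestion check can be illustrated with a single enzyme. The sketch below applies the commonly stated trypsin rule (cut after K or R unless the next residue is P) and reports cuts that fall inside an epitope. It is an assumption-laden illustration: the Protein Digest server covers several enzymes with their own rules, and the function names and the example peptide here are hypothetical.

```python
def trypsin_cleavage_sites(seq):
    """Return 1-based positions after which trypsin is expected to cut.

    Simplified rule: cut on the C-terminal side of K or R,
    except when the following residue is P.
    """
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in "KR" and seq[i + 1] != "P":
            sites.append(i + 1)
    return sites


def cuts_inside(seq, start, end):
    """Predicted cuts that would split the epitope spanning start..end.

    start and end are 1-based and inclusive. A cut after position p
    splits the epitope only if start <= p < end.
    """
    return [p for p in trypsin_cleavage_sites(seq) if start <= p < end]


peptide = "AKRPGKLM"  # hypothetical example peptide
print(trypsin_cleavage_sites(peptide))
print(cuts_inside(peptide, 3, 5))
print(cuts_inside(peptide, 5, 8))
```

An epitope for which `cuts_inside` returns an empty list would not be split by this one enzyme; a fuller check would repeat the test for each protease of interest.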
2.3. Prediction of the secondary structure, transmembrane helices and signal peptide of the fusion peptides
The secondary structure of the fusion peptides was analyzed using the improved self-optimized prediction method (SOPMA) software. This analysis was based on the conformational states of the fusion peptides (alpha helices, beta sheets, turns and coils).
In addition, to predict surface-exposed epitopes, the transmembrane helices (TMH) of the construct were predicted using the online TMHMM (Transmembrane Helices Hidden Markov Models) server v2.0 (Wu and Zhang 2007), and the potential signal peptide (SigP) cleavage site was identified with the SignalP v4.1 online tool (D-cutoff: 0.45).
2.4. Physicochemical properties and efficiency of the vaccine construct
How the designed vaccine behaves in the body after injection is central to vaccination, so the physicochemical features of the formulated vaccine candidate should be analyzed. We therefore used the ProtParam tool of the ExPASy web server (33). Various parameters were computed with this tool, including (i) molecular weight (kDa), (ii) theoretical isoelectric point (pI), (iii) estimated half-life in vitro and in vivo, (iv) stability index, (v) aliphatic index, (vi) extinction coefficient, and (vii) grand average of hydropathicity (GRAVY).
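Two of the listed parameters have simple closed forms and can be reproduced by hand as a sanity check. The sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent. It is an independent illustration, not ProtParam's code, and the function names are our own.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


example = "GGGGSGGGGSGGGGS"  # the (G4S)3 linker, used only as a short input
print(round(gravy(example), 3))
print(round(aliphatic_index(example), 1))
```

A negative GRAVY indicates an overall hydrophilic sequence, and a higher aliphatic index is usually read as a sign of greater thermostability.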
2.5. Three-dimensional structure modelling
The three-dimensional structure of the final vaccine construct was generated with the I-TASSER software (https://zhanglab.ccmb.med.Umich.Edu/I-TASSER/). The modelled structure was refined with the GalaxyRefine online server (Ko et al. 2012). The overall model quality was checked using the ProSA web server.
The stereochemical quality of the vaccine model was analyzed with a Ramachandran plot. To validate the 3D model, the Rampage server (Lovell et al. 2003) was used to assess the residues in the allowed and disallowed regions of the protein structural model.
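A Ramachandran plot is built from the backbone dihedral angles phi and psi of each residue. The sketch below shows how one such angle can be computed from four atom coordinates. It is a generic geometric illustration, not Rampage's implementation, and the coordinates in the usage lines are made up to give easily checked angles.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    phi(i) uses C(i-1), N(i), CA(i), C(i);
    psi(i) uses N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Made-up coordinates with the central bond along the z axis.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))   # quarter turn
```

Validation servers then compare each residue's (phi, psi) pair against empirically derived favoured, allowed and disallowed regions, which are not reproduced here.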
2.6. SiteSeer search
A SiteSeer search was run to scan the designed structure against a prepared database of templates and to match functionally important sites (Laskowski et al. 2005). In addition, 400 templates auto-generated from the query structure were scanned against representative structures in the PDB, and the possible matches were reported.
2.7. BLAST search of the protein sequence against UniProt
To assess the sequence similarity between our protein sequence and the retrieved sequences (hits), a BLAST (Basic Local Alignment Search Tool) search against UniProt was performed (Consortium 2007).
2.8. Nest Analysis
Nests are structural motifs, often found in functionally important regions of protein structures, that are formed by consecutive residues with enantiomeric left-handed (L) and right-handed (R) helical backbone conformations (Pal et al. 2020). Simple Nests are either RL or LR. Larger Nests (> 2 residues long), such as RLR, LRL and RLRL, are composed of overlapping simple Nests and have not been studied despite their extensive involvement in protein function. The most abundant doublets and triplets in Nests have a propensity for particular secondary structures, suggesting a strong sequence-structure relationship in the larger Nests (Pal et al. 2020). The ProFunc server (http://www.ebi.ac.uk/thornton-srv/databases/ProFunc) was used to predict probable protein function from the 3D structure.
2.9. Docking and Binding Affinity Analysis of Vaccine candidate with TLR-3
To predict the binding affinity of our vaccine candidate, the Gibbs free energy (ΔG), a critical thermodynamic parameter, and the dissociation constant (Kd) were estimated with the PRODIGY web server (Xue et al. 2016). The CASTp web server was used to determine the active binding pockets of the refined vaccine construct for the TLR-3 receptor (Sharma et al. 2021).
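The two quantities are linked by the standard thermodynamic relation ΔG = RT ln(Kd), so one can be converted to the other at a given temperature. The sketch below implements that relation only. It does not reproduce how PRODIGY predicts ΔG from a complex, and the default temperature of 25 °C and the example ΔG value are assumptions chosen for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g_kcal, temp_celsius=25.0):
    """Dissociation constant (M) from binding free energy (kcal/mol)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(delta_g_kcal / (R_KCAL * temp_kelvin))


def delta_g_from_kd(kd_molar, temp_celsius=25.0):
    """Binding free energy (kcal/mol) from dissociation constant (M)."""
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL * temp_kelvin * math.log(kd_molar)


# Hypothetical value, not a result from this study.
example_dg = -10.0
print(f"Kd at 25 C: {kd_from_delta_g(example_dg):.2e} M")
print(f"Kd at 37 C: {kd_from_delta_g(example_dg, 37.0):.2e} M")
```

A more negative ΔG corresponds to a smaller Kd, that is, tighter binding. The same ΔG gives a somewhat larger Kd at 37 °C than at 25 °C, which is why the temperature should be reported alongside Kd.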
Using new molecular methods alongside clinical trials can lead to better results. Because the genome organization of the SARS-CoV-2 virus has been determined, its important enzymatic and structural proteins have been identified. Almost two thirds of the viral genome is translated into the pp1a and pp1ab polyproteins, which are cleaved and processed into 16 nonstructural proteins. Virtual methods such as molecular docking can be used to identify effective viral therapies. The crystal structure of the human Toll-like receptor 3 (TLR3) protein was retrieved from the Protein Data Bank web site (http://www.rcsb.org/pdb) for docking with the designed vaccine. Molecular docking was performed with Molegro Virtual Docker 7 to analyze the probable interaction of this vaccine with TLR3.
2.10. Reverse Translation, Codon Adaptation Index (CAI) and in silico expression of vaccine candidate
Codon adaptation plays an important role in the expression of a foreign gene in a different host; the tool used here can adapt the codon usage to most sequenced prokaryotic organisms and to selected eukaryotic organisms. The CAI values were calculated with the algorithm of Carbone et al. (Carbone et al. 2003). In silico cloning and expression of the vaccine candidate in the E. coli pET-28(+) vector, at the XhoI and NotI restriction sites, was performed using the SnapGene 4.2 tool to verify maximal expression of the vaccine in the expression vector.
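The index itself is a geometric mean of per-codon relative adaptiveness weights. The sketch below shows only that final step. It does not derive the weights, which is where the procedure of Carbone et al. does its work, and the weight table in the usage lines is a toy example with invented values, not E. coli data.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of the relative adaptiveness of the codons in cds.

    weights maps codon -> relative adaptiveness in (0, 1]. Codons missing
    from the table (for example ATG, TGG or stop codons, which are usually
    excluded) are skipped. Trailing bases that do not form a full codon
    are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


# Toy weights with invented values, for illustration only.
toy_weights = {"GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}

print(round(codon_adaptation_index("ATGGCTAAA", toy_weights), 3))
print(round(codon_adaptation_index("ATGGCCAAG", toy_weights), 3))
```

A CAI of 1.0 means every scored codon is the preferred one for its amino acid in the reference set; replacing codons with rarer synonyms lowers the value.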
2.11. The conservation level and cross-protection of designed vaccine
Given the emergence of new SARS-CoV-2 strains and the current challenge of controlling the epidemiological situation they create, we analyzed the similarity and percent identity between our designed vaccine and the SARS-CoV-2 surface glycoprotein sequences deposited in NCBI.
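Percent identity is read off a pairwise alignment as the share of aligned columns in which both sequences carry the same residue. The sketch below assumes the two sequences have already been aligned (gaps written as "-") and counts identities over the alignment length, which is the convention BLAST reports. The aligned strings in the usage lines are invented, not data from this study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two gapped, equal-length aligned sequences.

    Columns where both sequences have a gap are ignored. A column counts
    as an identity only if both residues match and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Invented aligned fragments, for illustration only.
print(percent_identity("AC-GT", "ACTGA"))
print(percent_identity("MKVLA", "MKVLA"))
```

Running the same calculation for the vaccine's S1-NTD segment against each variant's surface glycoprotein would give one identity value per strain, which is the kind of comparison the paragraph above describes.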