3.1 Retrieval of the polyproteins and antigenicity
The amino acid sequences of the three structural proteins (spike, S; membrane, M; nucleocapsid, N) were retrieved from the NCBI database in FASTA format. Antigenicity was assessed with the VaxiJen web tool using the default threshold of 0.4. The spike protein scored 0.46, the membrane glycoprotein 0.51, and the nucleocapsid protein 0.50, so all three were predicted to be probable antigens. All three proteins were therefore carried forward for B cell and T cell epitope prediction and for construction of the vaccine.
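The selection rule above can be written out as a small check. This is only a sketch: the scores and the 0.4 threshold are the values reported in this section, while the function name and the choice to treat a score equal to the threshold as passing are assumptions made for illustration.

```python
# Antigenicity scores reported in this section.
scores = {
    "spike glycoprotein": 0.46,
    "membrane glycoprotein": 0.51,
    "nucleocapsid protein": 0.50,
}

THRESHOLD = 0.4  # default threshold stated in the text


def is_probable_antigen(score, threshold=THRESHOLD):
    """Return True when the score reaches the threshold.

    Assumption: a score exactly equal to the threshold counts as passing.
    """
    return score >= threshold


# Keep every protein whose score passes the threshold.
selected = [name for name, score in scores.items() if is_probable_antigen(score)]
print(selected)
```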
3.2 Prediction of CTL epitopes
CTL epitopes were predicted for all three selected proteins using the NetCTL 1.2 server. A total of 38 epitopes were predicted from the spike glycoprotein, 10 from the membrane glycoprotein, and 9 from the nucleocapsid protein. Of these, only 8 were selected for construction of the vaccine on the basis of high binding affinity towards MHC-I, antigenicity, non-allergenicity, and non-toxicity, as shown in Table 1.
3.3 Prediction of HTL epitopes
HTL epitopes were predicted for all three SMN structural proteins using the IEDB MHC II server. Four HTL epitopes were finally selected on the basis of binding affinity, antigenicity, non-allergenicity, and non-toxicity, as shown in Table 2. The human alleles and positions of the four selected epitopes are HLA-DRB1*07:01 (166-180), HLA-DRB4*01:01 (298-312), HLA-DRB5*01:01 (232-246), and HLA-DRB5*01:01 (345-359).
3.4 B cell epitope prediction
B cell epitopes were predicted using the ABCpred server. Based on the score (>0.9), non-allergenicity, and non-toxicity, a total of four B cell epitopes were finally selected, as shown in Table 3.
3.5 Construction of the multiepitope-based vaccine
The four B cell epitopes, four HTL epitopes, and eight CTL epitopes that fulfilled all the criteria of binding affinity, antigenicity, non-toxicity, and non-allergenicity were selected for vaccine construction. In addition, two adjuvants were added to increase antigenicity: human beta-defensin-3 at the N terminus and human beta-defensin-2 at the C terminus. The adjuvants were joined to the epitopes with EAAAK linkers, the HTL epitopes were joined with GPGPG linkers, and the CTL epitopes were joined with AAY linkers, as shown in Figure 1. The final construct was checked again and was predicted to be antigenic, non-allergenic, and non-toxic.
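The assembly described above can be sketched as simple string joining. The EAAAK, GPGPG, and AAY linkers come from the text. Everything else is an assumption made for illustration: the function name, the order of the three epitope blocks, the `block_linker` placed between blocks, and the `b_cell_linker` (the text does not say how the B cell epitopes were joined). The sequences in the usage lines are placeholder tokens, not real peptides.

```python
ADJUVANT_LINKER = "EAAAK"  # from the text
HTL_LINKER = "GPGPG"       # from the text
CTL_LINKER = "AAY"         # from the text


def build_construct(n_adjuvant, c_adjuvant, b_cell, htl, ctl,
                    b_cell_linker, block_linker=""):
    """Assemble a multi-epitope sequence from its parts.

    Assumptions (not stated in the text): the block order is
    B cell, HTL, CTL; blocks are joined by block_linker; B cell
    epitopes are joined by the caller-supplied b_cell_linker.
    """
    blocks = [
        b_cell_linker.join(b_cell),
        HTL_LINKER.join(htl),
        CTL_LINKER.join(ctl),
    ]
    epitope_region = block_linker.join(blocks)
    # Each adjuvant is attached to the epitope region through EAAAK.
    return ADJUVANT_LINKER.join([n_adjuvant, epitope_region, c_adjuvant])


# Placeholder tokens only; "KK" is a stand-in linker, not the authors' choice.
construct = build_construct(
    n_adjuvant="NADJ", c_adjuvant="CADJ",
    b_cell=["B1", "B2"], htl=["H1", "H2"], ctl=["C1", "C2"],
    b_cell_linker="KK",
)
print(construct)
```

Keeping the linkers as named constants and the unspecified ones as parameters makes it easy to see which parts of the layout come from the text and which are guesses.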
3.6 Prediction of physicochemical parameters of the constructed vaccine sequence
The physicochemical parameters of the vaccine sequence were predicted using the ProtParam server. The predicted molecular weight of the construct was 38.8 kDa, and the theoretical pI was 9.92. The predicted half-life in E. coli was more than 10 hours, and the instability index classified the construct as stable in a test tube. The aliphatic index of the vaccine sequence was 58.7, and the grand average of hydropathicity (GRAVY) was -0.348.
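Two of the parameters above have simple closed forms, shown here as a minimal sketch. GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is 100 × (x_Ala + 2.9 × x_Val + 3.9 × (x_Ile + x_Leu)), where x is a mole fraction. The function names and the toy input are illustrative; the vaccine sequence is not reproduced here, so the reported values are not recomputed.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole fractions of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    n = len(seq)
    x_ala = seq.count("A") / n
    x_val = seq.count("V") / n
    x_ile_leu = (seq.count("I") + seq.count("L")) / n
    return 100.0 * (x_ala + 2.9 * x_val + 3.9 * x_ile_leu)


toy = "AVIL"  # toy input, not the vaccine sequence
print(gravy(toy), aliphatic_index(toy))
```

A negative GRAVY, as reported for the construct, corresponds to a sequence that is hydrophilic on average.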
3.7 Secondary structure prediction of the vaccine sequence
Secondary structure was predicted using the CFSSP web tool. The predicted composition was 44.5% helix, 35.6% sheet, and 14% turns.
3.8 Tertiary structure prediction of the vaccine sequence
The 3D structure of the multiepitope vaccine was predicted using the Rosetta web tool. It performs de novo structure prediction, using a deep neural network to predict inter-residue distances and orientations. These predictions are converted into smooth inter-residue constraints, which are then used in gradient-descent energy minimization. Coarse-grained models are generated first, followed by full-atom refinement. The server returned the five best-predicted models, and one was selected on the basis of its TM score for further investigation, as shown in Figure 2A. To validate the selected model, Ramachandran plot analysis was performed: 96.3% of residues were in the favourable region, 2.5% in the allowed region, and ~1% in the outlier region, as shown in Figure 2B. Additionally, the ProSA web tool was used to assess the quality of the modeled vaccine and gave a Z score of -6.34. Together, the Ramachandran plot and the Z score suggest that the predicted model is valid and can be taken forward for further analysis.
3.9 Conformational B cell epitope analysis from modelled vaccine
ElliPro, an online server, predicts antibody epitopes from a protein 3D structure. It was used to identify both linear and discontinuous (conformational) B cell epitopes in the vaccine construct. A total of 8 linear epitopes were predicted; the sequences of the top 3 are reported in Table 4 and shown structurally in Figure 2D. Discontinuous epitope residues were predicted in five regions of the vaccine sequence: 232-253 (21 epitope residues), 299-357 (55 epitope residues), 1-54 (52 epitope residues), 69-128 (33 epitope residues), and 168-176 (9 epitope residues). The individual score of each discontinuous epitope is shown in Figure 3B.
3.10 Docking of vaccine with TLR-3 receptor
The modeled vaccine structure was subjected to energy minimization, equilibration, and MD simulation before docking, and the last frame of the simulated trajectory was used for docking. The simulated structure is compared with the crude modeled structure in Figure 2A. The TLR-3 structure was retrieved from the Protein Data Bank (PDB ID 1ZIW) and prepared for docking using the Dock Prep tool of UCSF Chimera [45]. Docking was performed using the PatchDock server and refined using FireDock. The best-docked complex had a global energy of -14.91 kcal/mol and an attractive van der Waals energy of -18.1 kcal/mol, which indicates a decent binding affinity of the vaccine towards TLR-3. The best binding pose was then examined for polar interactions between TLR-3 and the vaccine using Discovery Studio Visualizer [46]: GLN352, SER428, and ILE370 of TLR-3 formed hydrogen bonds with TYR260, ARG321, and LYS166 of the vaccine, respectively, as shown in Figure 3A.
3.11 MD simulation of TLR-3-vaccine complex
The docked complex was then subjected to MD simulation in water using GROMACS. After minimization and NPT and NVT equilibration, a 300-picosecond production run was performed to assess the dynamic behavior and stability of the docked complex. The RMSD plot of the trajectory indicated a stable complex: after 100 picoseconds the trajectory converged, with the RMSD fluctuating between 2.25 Å and 2.5 Å, as shown in Figure 3C.
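For readers unfamiliar with the RMSD measure used above, a minimal sketch of the calculation follows. It assumes each frame has already been superposed on the reference structure and that coordinates are given as (x, y, z) tuples in Å; the function names and toy coordinates are illustrative, and this is not the analysis tool used for Figure 3C.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumption: the two structures are already superposed, so no
    fitting is done here.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (x - rx) ** 2 + (y - ry) ** 2 + (z - rz) ** 2
        for (x, y, z), (rx, ry, rz) in zip(coords, reference)
    )
    return math.sqrt(squared / len(coords))


def rmsd_series(trajectory, reference):
    """RMSD of every frame in a trajectory against one reference frame."""
    return [rmsd(frame, reference) for frame in trajectory]


# Toy two-atom system: the second frame is shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [reference, [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]]
print(rmsd_series(trajectory, reference))
```

A trajectory is usually read as converged when this series stops drifting and fluctuates within a narrow band, which is the behavior described for the complex after 100 picoseconds.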
3.12 Reverse translation and codon optimization
The Java Codon Adaptation Tool (JCat) was used to optimize codon usage for proper expression of the protein. E. coli strain K12 was chosen as the host, with the additional options to avoid rho-independent transcription terminators, prokaryotic ribosome binding sites, and cleavage sites of restriction enzymes. The codon adaptation index (CAI) summarizes codon usage; a score between 0.8 and 1 is generally considered good, and the GC content should be between 40% and 70%. Values outside these ranges are suggested to indicate inefficient expression. The optimized nucleotide sequence of the vaccine had a CAI of 0.92 and a GC content of 55.6%, which indicates that the protein should be expressed effectively in E. coli.
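The two measures above can be illustrated with a short sketch. GC content is the percentage of G and C bases, and CAI is the geometric mean of per-codon relative adaptiveness weights. The weight table below is hypothetical and exists only to make the example run; real weights come from the codon usage of the host's highly expressed genes, and this is not the JCat implementation. The acceptance ranges are the ones quoted in the text.

```python
import math


def gc_content(dna):
    """Percentage of G and C bases in a nucleotide sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def cai(dna, weights):
    """Codon adaptation index: geometric mean of codon weights.

    weights maps a codon to its relative adaptiveness in (0, 1].
    Codons missing from the table are skipped.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


def within_reported_ranges(cai_value, gc_percent):
    """Check the ranges quoted in the text: CAI 0.8 to 1, GC 40% to 70%."""
    return 0.8 <= cai_value <= 1.0 and 40.0 <= gc_percent <= 70.0


toy_weights = {"GCT": 1.0, "GCC": 0.25}  # hypothetical weights
print(cai("GCTGCC", toy_weights))
print(within_reported_ranges(0.92, 55.6))  # values reported for the construct
```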