Identification, selection, and retrieval of Nipah protein sequences. The Nipah virus proteins were downloaded from the NCBI database (https://www.ncbi.nlm.nih.gov/). Nine protein sequences, i.e., nucleocapsid protein (ID: NP_112021.1), P phosphoprotein (ID: NP_112022.1), V protein (ID: NP_112023.1), W protein (ID: YP_007188592.1), C protein (ID: NP_112024.1), matrix protein (ID: NP_112025.1), fusion protein (ID: NP_112026.1), attachment glycoprotein (ID: NP_112027.1), and polymerase (ID: NP_112028.1), were selected for possible vaccine construction and retrieved in protein FASTA format. Table 1 lists the protein sequences with their NCBI accession numbers.
Table 1
ID | Protein
NP_112021.1 | Nucleocapsid protein
NP_112022.1 | P phosphoprotein
NP_112023.1 | V protein
YP_007188592.1 | W protein
NP_112024.1 | C protein
NP_112025.1 | matrix protein
NP_112026.1 | fusion protein
NP_112027.1 | attachment glycoprotein
NP_112028.1 | polymerase
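The retrieval step can also be scripted. The sketch below is one possible route, not the procedure used here: it assumes the NCBI E-utilities `efetch` endpoint rather than the web interface, builds the request URL for the Table 1 accessions without sending it, and includes a small FASTA parser. The helper names `efetch_fasta_url` and `parse_fasta` are hypothetical.

```python
from urllib.parse import urlencode

# Accession numbers from Table 1.
ACCESSIONS = [
    "NP_112021.1",     # nucleocapsid protein
    "NP_112022.1",     # P phosphoprotein
    "NP_112023.1",     # V protein
    "YP_007188592.1",  # W protein
    "NP_112024.1",     # C protein
    "NP_112025.1",     # matrix protein
    "NP_112026.1",     # fusion protein
    "NP_112027.1",     # attachment glycoprotein
    "NP_112028.1",     # polymerase
]

# Assumption: the E-utilities efetch endpoint is used instead of the NCBI website.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions):
    """Build (but do not send) a request URL for protein FASTA records."""
    query = urlencode({
        "db": "protein",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH}?{query}"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header line to sequence."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


if __name__ == "__main__":
    print(efetch_fasta_url(ACCESSIONS))
```

The text returned for that URL could be passed to `parse_fasta` to obtain one sequence per accession.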
Cytotoxic T lymphocyte epitope prediction. In the production of subunit vaccines, cytotoxic T lymphocyte (CTL) epitope prediction is very important for stimulating the immune system. The NetCTL 1.2 server (https://www.cbs.dtu.dk/services/NetCTL/) was used to predict CTL epitopes [25]. MHC-I binding affinity, proteasomal C-terminal cleavage, and TAP (Transporter Associated with Antigen Processing) transport efficiency are the major factors influencing epitope selection. The default parameters were used: weights of 0.05 for TAP transport efficiency and 0.15 for proteasomal C-terminal cleavage, and a threshold of 0.75 for epitope identification. The predicted epitopes were classified according to the combined score.
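As an illustration of how the combined score relates to its components, the sketch below assumes the combined score is the rescaled MHC-I binding affinity plus the weighted cleavage and TAP terms. Under that assumption, the values listed in Table 3 reproduce the reported prediction scores to within rounding. The function name `combined_score` is hypothetical.

```python
# Default NetCTL 1.2 settings quoted in the text.
W_CLEAVAGE = 0.15   # weight on proteasomal C-terminal cleavage
W_TAP = 0.05        # weight on TAP transport efficiency
THRESHOLD = 0.75    # threshold for epitope identification


def combined_score(rescaled_affinity, cleavage, tap):
    """Weighted sum of the three components (assumed form of the score)."""
    return rescaled_affinity + W_CLEAVAGE * cleavage + W_TAP * tap


# peptide: (rescaled affinity, cleavage, TAP efficiency, reported score), from Table 3
TABLE3 = {
    "ETDDYNGIY": (3.451, 0.9526, 2.571, 3.7225),
    "VSNTSKHTY": (2.4214, 0.9448, 2.981, 2.7122),
}

if __name__ == "__main__":
    for peptide, (aff, cle, tap, reported) in TABLE3.items():
        score = combined_score(aff, cle, tap)
        print(peptide, f"computed={score:.4f}", f"reported={reported}",
              "epitope" if score > THRESHOLD else "rejected")
```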
Prediction of helper T lymphocyte epitopes. To predict helper T lymphocyte (HTL) epitopes, we used the IEDB MHC II server (http://tools.iedb.org/mhcii/) [26]. The species/locus was set to Human/HLA-DR, and the 7-allele human leukocyte antigen (HLA) reference set was selected for HTL epitope prediction. The default epitope length of 15-mer was used, and the generated epitopes were classified according to their percentile rank. The percentile rank is obtained by comparing the epitope score with those of 5 million 15-mer peptides from the SWISSPROT database; a lower percentile rank indicates a higher affinity for MHC-II.
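The 15-mer scanning and rank-based selection can be sketched as follows. This is a minimal illustration, not the IEDB implementation: the 19-residue fragment is assembled from three overlapping Table 4 peptides (positions 503 to 521), and the cutoff `max_rank` is a hypothetical parameter, since no explicit percentile cutoff is stated in the text.

```python
def sliding_peptides(sequence, length=15, first_position=1):
    """Yield (start, end, peptide) for every overlapping window (inclusive positions)."""
    for i in range(len(sequence) - length + 1):
        start = first_position + i
        yield start, start + length - 1, sequence[i:i + length]


def best_binders(predictions, max_rank):
    """Keep predictions at or below max_rank; lowest percentile rank first."""
    kept = [p for p in predictions if p["rank"] <= max_rank]
    return sorted(kept, key=lambda p: p["rank"])


# Fragment reconstructed from overlapping Table 4 peptides (positions 503-521).
FRAGMENT = "ASLCIGLITFISFIIVEKK"

# Peptides and percentile ranks as listed in Table 4.
PREDICTIONS = [
    {"peptide": "PLVVNWRNNTVISRP", "rank": 1.0},
    {"peptide": "ASLCIGLITFISFII", "rank": 0.44},
    {"peptide": "DLELASFLMDRRVIL", "rank": 0.51},
]

if __name__ == "__main__":
    for start, end, peptide in sliding_peptides(FRAGMENT, 15, first_position=503):
        print(start, end, peptide)
    # max_rank=0.5 is an illustrative cutoff only.
    print(best_binders(PREDICTIONS, max_rank=0.5))
```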
Prediction of interferon-gamma-inducing epitopes. HTL epitopes retained from the previous step were further analyzed to determine whether they can stimulate an interferon-gamma (IFN-γ) immune response, using the IFNepitope server for 15-mer peptides (http://crdd.osdd.net/raghava/ifnepitope/scan.php). The server prediction is based on a Support Vector Machine (SVM), and epitopes were predicted with the IFN-γ versus non-IFN-γ model [27]. Finally, epitopes with a positive IFN-γ result were carried forward to the in silico vaccine construction.
Prediction of linear B-cell epitopes. Activation of antibody-producing B lymphocytes is central to the humoral immune response. We used the ABCpred server (http://www.imtech.res.in/raghava/abcpred/) [28] to predict epitopes that activate B cells and thereby initiate the humoral immune response. The server is based on a recurrent neural network; a window length of 16-mer and a threshold of 0.51 were used, with the overlapping filter kept on. Because a large number of epitopes were predicted, only epitopes with a score above 0.9 were chosen for the construction of the candidate vaccine.
Prediction of antigenicity of protein sequences. Vaccine building blocks must be antigenic. ANTIGENpro and VaxiJen v2.0 were both used to predict the antigenicity of the epitopes. ANTIGENpro (http://scratch.proteomics.ics.uci.edu/) uses microarray data to calculate protein antigenicity. Its accuracy on the combined dataset was estimated to be 76% based on cross-validation experiments [29]. The VaxiJen 2.0 server (http://www.ddg-pharmfac.net/Vaxijen/VaxiJen/VaxiJen.html) was also used to predict antigenicity, with virus selected as the target organism and a threshold value of ≥0.4; only epitopes predicted as probable antigens were selected for vaccine construction. VaxiJen 2.0 is based on auto- and cross-covariance (ACC) transformation. The algorithm is alignment-independent and analyzes the physicochemical properties of proteins to classify them as antigenic [30].
Prediction of allergenicity and toxicity of protein sequences. Allergenicity is a vital factor in vaccine development. The AllerTOP v.2.0 and AllergenFP servers were used to evaluate the allergenicity of the proteins. AllerTOP v.2.0 (http://www.ddg-pharmfac.net/AllerTOP) classifies peptides with a k-nearest-neighbors (kNN) machine learning model built on amino acid E-descriptors, which encode the physicochemical properties of proteins, and auto- and cross-covariance (ACC) transformation. Its accuracy reaches 85.3% at fivefold cross-validation [31]. AllergenFP (http://ddg-pharmfac.net/AllergenFP/) is an alignment-free, fingerprint-based approach for distinguishing allergens from non-allergens. The fingerprinting approach is based on a four-step algorithm. First, the protein sequences are described in terms of their properties, including size, hydrophobicity, relative abundance, and helix- and β-strand-forming propensities. Second, the resulting strings are transformed into vectors of equal size by ACC. Third, the vectors are converted into binary fingerprints, and finally the fingerprints are compared in terms of the Tanimoto coefficient. The accuracy of this method reaches 88%. Non-allergen epitopes were chosen. Finally, all epitopes were checked for toxicity using the ToxinPred server (https://webs.iiitd.edu.in/raghava/toxinpred/multi_submit.php) [32], and non-toxic epitopes were chosen. All epitopes that passed the preceding analyses were used for vaccine construction.
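The fingerprint comparison step can be illustrated with a short sketch of the Tanimoto coefficient on binary fingerprints, together with a nearest-neighbor label assignment. The toy fingerprints and the helper `classify_by_nearest` are hypothetical and serve only to show the calculation; this is not the AllergenFP implementation.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared set bits divided by bits set in either fingerprint."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 0.0


def classify_by_nearest(query, references):
    """Return the label of the reference fingerprint most similar to the query."""
    label, _ = max(references, key=lambda ref: tanimoto(query, ref[1]))
    return label


# Hypothetical 8-bit fingerprints, for illustration only.
REFERENCES = [
    ("allergen", [1, 1, 0, 0, 1, 0, 1, 0]),
    ("non-allergen", [0, 0, 1, 1, 0, 1, 0, 1]),
]

if __name__ == "__main__":
    query = [0, 0, 1, 1, 0, 1, 0, 0]
    for label, fingerprint in REFERENCES:
        print(label, round(tanimoto(query, fingerprint), 3))
    print("assigned:", classify_by_nearest(query, REFERENCES))
```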
Construction of multi-epitope vaccine candidate sequences. Selected CTL, HTL, and B-cell epitopes generated by the NetCTL 1.2, IEDB MHC II, and ABCpred servers, respectively, were used to construct the vaccine. The linear B-cell and HTL epitopes were joined with GPGPG linkers, and the CTL epitopes with AAY linkers. In addition, three different adjuvants (beta-defensin, L7/L12 ribosomal protein, and HABA protein) were used to increase the immunogenicity of the vaccine and were attached via an EAAAK linker [23]. Different combinations of epitopes and adjuvants were generated to produce all possible vaccine constructs, yielding almost 500 constructs. The constructs were then filtered to retain those that were highly antigenic, immunogenic, non-toxic, and non-allergenic.
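The linker scheme can be sketched as simple string assembly. Several details below are assumptions made for illustration: the order of the blocks (adjuvant, CTL, HTL, B-cell), the use of GPGPG between the CTL block and the HTL block, and the adjuvant sequences, which are shown as placeholder strings because they are not given in this section. The epitopes are taken from Tables 3, 4, and 5.

```python
from itertools import permutations, product

EAAAK, AAY, GPGPG = "EAAAK", "AAY", "GPGPG"


def build_construct(adjuvant, ctl, htl, bcell):
    """Join adjuvant and epitope blocks with the linkers named in the text."""
    ctl_block = AAY.join(ctl)                          # CTL epitopes: AAY linker
    helper_block = GPGPG.join(list(htl) + list(bcell))  # HTL and B-cell: GPGPG linker
    # Assumed block order and assumed GPGPG bridge between CTL and HTL blocks.
    return adjuvant + EAAAK + ctl_block + GPGPG + helper_block


def enumerate_constructs(adjuvants, ctl, htl, bcell):
    """Yield one construct per adjuvant and per ordering of the CTL epitopes."""
    for (name, seq), order in product(adjuvants.items(), permutations(ctl)):
        yield name, build_construct(seq, order, htl, bcell)


# Placeholder adjuvant sequences (hypothetical; real sequences not given here).
ADJUVANTS = {"beta-defensin": "XXXXX", "L7/L12": "XXXXX", "HABA": "XXXXX"}
CTL = ["ETDDYNGIY", "VSNTSKHTY"]          # Table 3
HTL = ["PLVVNWRNNTVISRP"]                 # Table 4 (one example)
BCELL = ["SYMIYLMNWCDFKKSP"]              # Table 5 (one example)

if __name__ == "__main__":
    for name, construct in enumerate_constructs(ADJUVANTS, CTL, HTL, BCELL):
        print(name, construct)
```

Enumerating orderings of the HTL and B-cell epitopes in the same way would quickly multiply the number of candidate constructs.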
Physicochemical properties and solubility prediction. ProtParam, one of the ExPASy server tools (https://web.expasy.org/protparam/), was used to determine physicochemical properties of the vaccine constructs, including isoelectric point (pI), amino acid number and composition, half-life, instability index, aliphatic index, molecular weight (MW), and grand average of hydropathicity (GRAVY) [33]. The solubility of the multi-epitope vaccine was predicted using the Protein-Sol server (http://protein-sol.manchester.ac.uk). The population average solubility of the experimental dataset is 0.45, so a construct with a value greater than 0.45 is predicted to be more soluble than the average protein [34].
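Two of these properties are simple enough to compute directly. The sketch below assumes the standard definitions: GRAVY as the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index as a weighted mole percentage of Ala, Val, Ile, and Leu. It is an illustration only and is not a substitute for the ProtParam output; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

SOLUBILITY_POPULATION_AVERAGE = 0.45  # Protein-Sol reference value from the text


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: mol% Ala + 2.9 * mol% Val + 3.9 * (mol% Ile + mol% Leu)."""
    n = len(sequence)
    ala, val = sequence.count("A"), sequence.count("V")
    ile_leu = sequence.count("I") + sequence.count("L")
    return 100.0 * (ala + 2.9 * val + 3.9 * ile_leu) / n


def more_soluble_than_average(scaled_solubility):
    """Compare a Protein-Sol scaled solubility value with the population average."""
    return scaled_solubility > SOLUBILITY_POPULATION_AVERAGE


if __name__ == "__main__":
    peptide = "ETDDYNGIY"  # CTL epitope from Table 3, used as a short test input
    print(f"GRAVY = {gravy(peptide):.3f}")
    print(f"Aliphatic index = {aliphatic_index(peptide):.2f}")
```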
Secondary structure prediction. The PSIPRED and RaptorX servers were used to predict the secondary structures of the vaccine constructs. PSIPRED (http://bioinf.cs.ucl.ac.uk/psipred/) is an online secondary structure prediction tool [35]. The RaptorX server (http://raptorx.uchicago.edu/StructurePropertyPred/predict/) was also used to calculate the secondary structural characteristics of the vaccine. The server uses a Deep Convolutional Neural Fields (DeepCNF) machine learning model to predict secondary structure (SS), disorder regions (DISO), and solvent accessibility (ACC) [36].
Tertiary structure prediction. A three-dimensional (3D) model of each vaccine was generated using the homology modeling server GalaxyWEB (http://galaxy.seoklab.org/). GalaxyWEB is a platform for computational protein structure and function prediction based on the sequence-to-structure-to-function approach; it identifies highly similar template structures from the Protein Data Bank (PDB) [37]. GalaxyWEB generates 3D atomic models by performing multiple sequence alignments and iterative structure assembly, and it automatically refined the 3D models obtained for the vaccines. The refinement method was validated in CASP10 refinement experiments. It relaxes the structure through side-chain repacking and molecular dynamics simulation [38].
Validation of tertiary structure. Validation of the 3D structure is a critical stage of model construction because it detects potential errors in the predicted 3D models [39]. The ProSA-web server (https://prosa.services.came.sbg.ac.at/prosa.php) was used for protein 3D structure validation; it calculates an overall quality score, which is displayed as a Z-score plot. If the Z-score lies outside the range characteristic of native proteins, the structure likely contains errors [40]. The ERRAT web server (http://services.mbi.ucla.edu/ERRAT/) was also used to analyze non-bonded atom-atom interactions in comparison with high-resolution crystallographic structures. A Ramachandran plot was obtained from the RAMPAGE web server (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php); it describes the quality of the modeled structure by showing the percentage of residues in allowed and disallowed regions [41].
Prediction of discontinuous B-cell epitopes. ElliPro, an online tool (http://tools.iedb.org/ellipro/), was used to predict discontinuous (conformational) B-cell epitopes in the validated 3D structure. ElliPro approximates the protein shape as an ellipsoid, calculates the protrusion index (PI) of each residue, and clusters neighboring residues. ElliPro reports a score for each epitope, defined as the average PI value over its residues. A PI value of 0.9 means that 90% of the protein residues lie within the ellipsoid, while the remaining 10% lie outside. ElliPro gave the top predictions, with an area under the curve (AUC) value of 0.732.
Molecular docking of the final vaccine with the immune receptor. Molecular docking models the interaction between an antigenic ligand and the immune receptor that gives rise to an immune response. The structure of Toll-like receptor 3 (TLR3; PDB ID: 1ZIW) was retrieved from the Protein Data Bank (PDB) (https://www.rcsb.org). The online servers ClusPro 2.0 (https://cluspro.bu.edu/login.php), HADDOCK (https://haddock.science.uu.nl/), PatchDock (https://bioinfo3d.cs.tau.ac.il/PatchDock/php.php), and FireDock (http://bioinfo3d.cs.tau.ac.il/FireDock/php.php) were used for molecular docking and docking refinement between the vaccine and TLR3 [42]. Docking was then performed a third time using the HawkDock server (http://cadd.zju.edu.cn/hawkdock/), and the Molecular Mechanics/Generalized Born Surface Area (MM-GBSA) score was calculated on the same server to estimate the binding affinity; the lowest score is the best score [43].
Molecular dynamics simulation. After constructs with poor docking scores were filtered out, a molecular dynamics (MD) simulation study was performed for the selected vaccines that showed the best molecular docking results. The iMODS web server (http://imods.chaconlab.org/), a fast and freely accessible server for characterizing protein flexibility and deformability, was used for the MD study [44].
Immune simulation. C-ImmSim, an online simulation server (http://150.146.2.1/C-IMMSIM/index.php), was used to predict the immune response to the vaccine constructs. C-ImmSim estimates the humoral and cellular immune response to the vaccine construct [45]. Three injections of the target Nipah virus vaccine were administered at intervals of 4 weeks. All simulation parameters were kept at their defaults, with the injection time steps set at 1, 84, and 168. The simulation volume and the number of simulation steps were set at 50 and 1000, respectively, and the random seed was 12345.
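The relation between the injection time steps and the 4-week interval can be checked with a short calculation. The sketch assumes that one C-ImmSim time step corresponds to 8 hours of real time; this conversion is not stated in the text above and is labeled as an assumption in the code.

```python
HOURS_PER_STEP = 8  # assumption: one simulation time step represents 8 hours

INJECTION_STEPS = [1, 84, 168]   # from the text
TOTAL_STEPS = 1000               # from the text


def step_to_days(step):
    """Convert a simulation time step to days of real time."""
    return step * HOURS_PER_STEP / 24


if __name__ == "__main__":
    for step in INJECTION_STEPS:
        print(f"injection at step {step}: day {step_to_days(step):.1f}")
    print(f"simulated period: {step_to_days(TOTAL_STEPS):.0f} days")
```

Under this assumption, steps 84 and 168 fall on days 28 and 56, which matches the stated 4-week spacing.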
Codon optimization and in silico cloning. A codon optimization approach was adopted to improve the expression of the recombinant protein. Codon optimization is essential because the degeneracy of the genetic code allows most amino acids to be encoded by multiple codons. The Java Codon Adaptation Tool (JCat) server (http://www.prodoric.de/JCat) was used with the E. coli (strain K12) codon system to estimate protein expression levels from the codon adaptation index (CAI) and GC content. The optimum CAI value is 1.0, and a value above 0.8 is considered a good score; a GC content between 30 and 70% is also favorable, because values outside this range have unfavorable effects on translational and transcriptional efficiency [46]. The optimized gene sequence of the vaccine was then cloned in E. coli in silico. NdeI and HindIII restriction sites were added to the N- and C-terminal ends of the sequence for insertion into the E. coli plasmid vector pET-30a (+). Finally, the optimized sequence of the final vaccine construct (with restriction sites) was inserted into the plasmid vector pET-30a (+) using the SnapGene software (https://www.snapgene.com/free-trial/) to confirm the expression of the vaccine.
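Two of the checks in this step, the GC content window and the handling of restriction sites, can be sketched in a few lines. The recognition sequences CATATG (NdeI) and AAGCTT (HindIII) are standard; placing NdeI at the 5' end and HindIII at the 3' end is an assumption about orientation, and the DNA string in the usage example is a hypothetical toy sequence, not the optimized vaccine gene. CAI is not computed here because it requires a codon usage table.

```python
NDEI = "CATATG"      # NdeI recognition site (palindromic)
HINDIII = "AAGCTT"   # HindIII recognition site (palindromic)


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def gc_in_range(dna, low=30.0, high=70.0):
    """True if GC content lies within the 30-70% window given in the text."""
    return low <= gc_content(dna) <= high


def internal_sites(dna, sites=(NDEI, HINDIII)):
    """Return cloning sites already present in the insert, which would cause extra cuts."""
    dna = dna.upper()
    return [site for site in sites if site in dna]


def add_cloning_sites(dna):
    """Flank the insert with NdeI (5') and HindIII (3'); orientation is assumed."""
    return NDEI + dna.upper() + HINDIII


if __name__ == "__main__":
    toy_gene = "ATGGCTGAAACCGATGACTACAACGGTATCTAC"  # hypothetical example
    print(f"GC content: {gc_content(toy_gene):.1f}%", gc_in_range(toy_gene))
    print("internal sites:", internal_sites(toy_gene))
    print(add_cloning_sites(toy_gene))
```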
Table 2: Physicochemical analysis of the candidate proteins
Protein Name | Antigenicity Score | Antigenicity | Length (aa) | Allergenicity | Half-life time | GRAVY Score | Instability index (II) | Molecular weight (Da) | Theoretical pI
Nucleocapsid protein | 0.5713 | Antigen | 532 | NON-ALLERGEN | 30 h | -0.236 | 52.33 | 58168.07 | 6.06
P phosphoprotein | 0.5866 | Antigen | 709 | NON-ALLERGEN | 30 h | -0.730 | 48.52 | 78302.51 | 4.61
V protein | 0.6252 | Antigen | 456 | NON-ALLERGEN | 30 h | -0.816 | 60.47 | 50325.44 | 4.66
W protein | 0.6199 | Antigen | 449 | NON-ALLERGEN | 30 h | -0.827 | 57.56 | 49464.67 | 4.84
C protein | 0.3300 | Non-Antigen | 166 | NON-ALLERGEN | 30 h | -0.345 | 43.67 | 19735.59 | 9.44
matrix protein | 0.4033 | Antigen | 352 | NON-ALLERGEN | 30 h | -0.211 | 29.53 | 39928.28 | 9.31
fusion protein | 0.5012 | Antigen | 546 | NON-ALLERGEN | 30 h | 0.195 | 38.00 | 60281.96 | 5.85
attachment glycoprotein | 0.5110 | Antigen | 602 | ALLERGEN | 30 h | -0.178 | 34.56 | 67039.03 | 8.58
polymerase | 0.4757 | Antigen | 2244 | NON-ALLERGEN | 30 h | -0.286 | 41.87 | 257232.51 | 7.53
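The thresholds described in the methods can be applied to the Table 2 values as a consistency check. The sketch assumes that the Antigenicity Score column is the VaxiJen score (threshold ≥0.4) and uses the common ProtParam convention that an instability index below 40 indicates a stable protein; neither assumption is stated in the table itself. The function name `classify` is hypothetical.

```python
ANTIGEN_THRESHOLD = 0.4   # VaxiJen threshold from the methods
STABLE_BELOW = 40.0       # assumption: ProtParam convention for the instability index

# (name, antigenicity score, reported antigenicity, allergenicity, instability index)
TABLE2 = [
    ("Nucleocapsid protein", 0.5713, "Antigen", "NON-ALLERGEN", 52.33),
    ("P phosphoprotein", 0.5866, "Antigen", "NON-ALLERGEN", 48.52),
    ("V protein", 0.6252, "Antigen", "NON-ALLERGEN", 60.47),
    ("W protein", 0.6199, "Antigen", "NON-ALLERGEN", 57.56),
    ("C protein", 0.3300, "Non-Antigen", "NON-ALLERGEN", 43.67),
    ("matrix protein", 0.4033, "Antigen", "NON-ALLERGEN", 29.53),
    ("fusion protein", 0.5012, "Antigen", "NON-ALLERGEN", 38.00),
    ("attachment glycoprotein", 0.5110, "Antigen", "ALLERGEN", 34.56),
    ("polymerase", 0.4757, "Antigen", "NON-ALLERGEN", 41.87),
]


def classify(row):
    """Derive antigen, non-allergen, and stability flags for one Table 2 row."""
    name, score, _reported, allergenicity, instability = row
    return {
        "name": name,
        "antigen": score >= ANTIGEN_THRESHOLD,
        "non_allergen": allergenicity == "NON-ALLERGEN",
        "stable": instability < STABLE_BELOW,
    }


if __name__ == "__main__":
    for row in TABLE2:
        flags = classify(row)
        agrees = flags["antigen"] == (row[2] == "Antigen")
        print(flags, "label consistent" if agrees else "label differs")
```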
Table 3: List of the selected CTL epitopes that meet all the antigenicity, non-allergenicity, and non-toxicity criteria.
Peptide | MHC binding affinity | Rescale binding affinity | C-terminal cleavage affinity | Transport efficiency | Prediction score | Antigenicity | Toxicity | Allergenicity
ETDDYNGIY | 0.8128 | 3.451 | 0.9526 | 2.571 | 3.7225 | Antigenic | Non-toxic | Non-allergen
VSNTSKHTY | 0.5703 | 2.4214 | 0.9448 | 2.981 | 2.7122 | Antigenic | Non-toxic | Non-allergen
Table 4: List of the epitopes selected from MHC-II which are antigenic, non-allergenic, non-toxic, and can induce IFN-γ immune response.
Allele | Start | End | Length | Peptide sequence | Percentile score | Method | IFN-γ Result | Antigenicity | Toxicity | Allergenicity
HLA-DRB3*02:02 | 474 | 488 | 15 | PLVVNWRNNTVISRP | 1 | NetMHCIIpan | Positive | Antigenic | Non-toxic | Non-allergen
HLA-DRB1*15:01 | 503 | 517 | 15 | ASLCIGLITFISFII | 0.44 | Consensus (smm/nn/sturniolo) | Positive | Antigenic | Non-toxic | Non-allergen
HLA-DRB1*15:01 | 507 | 521 | 15 | IGLITFISFIIVEKK | 0.44 | Consensus (smm/nn/sturniolo) | Positive | Antigenic | Non-toxic | Non-allergen
HLA-DRB1*15:01 | 504 | 518 | 15 | SLCIGLITFISFIIV | 0.44 | Consensus (smm/nn/sturniolo) | Positive | Antigenic | Non-toxic | Non-allergen
HLA-DRB3*01:01 | 1101 | 1115 | 15 | DLELASFLMDRRVIL | 0.51 | Consensus (comb.lib./smm/nn) | Positive | Antigenic | Non-toxic | Non-allergen
Table 5: Linear B-cell epitopes selected for building the final vaccine, with a binding score greater than 0.93.
Name | Sequence | Start position | Score | Allergenicity | Antigenicity Score | Antigenicity | Toxicity
polymerase | SYMIYLMNWCDFKKSP | 1631 | 0.95 | NON-ALLERGEN | 0.4189 | ANTIGEN | Non-Toxic
P phosphoprotein | KGKGERKGKNNPELKP | 581 | 0.95 | NON-ALLERGEN | 1.4193 | ANTIGEN | Non-Toxic
polymerase | AALIPAPIGGFNYLNL | 989 | 0.93 | NON-ALLERGEN | 0.7346 | ANTIGEN | Non-Toxic
attachment glycoprotein | SFSWDTMIKFGDVLTV | 457 | 0.93 | NON-ALLERGEN | 0.8741 | ANTIGEN | Non-Toxic
P phosphoprotein | HWSIERSISPDKTEIV | 365 | 0.93 | NON-ALLERGEN | 0.6304 | ANTIGEN | Non-Toxic
polymerase | NHLIYDPDPVSEIDCS | 1454 | 0.92 | NON-ALLERGEN | 0.5988 | ANTIGEN | Non-Toxic
fusion protein | KSSIESTNEAVVKLQE | 148 | 0.94 | NON-ALLERGEN | 0.6356 | ANTIGEN | Non-Toxic