Designing of a novel multisubunit vaccine against Nipah virus structural proteins: A reverse vaccinology approach

Background The significant public health risk posed by NiV zoonosis and the lack of effective countermeasures against the intermittent outbreaks of the disease in South and Southeast Asia call for a protective vaccine to prevent or mitigate its epidemic potential. This study is an endeavor to design an effective, safe multisubunit vaccine using an in silico reverse vaccinology approach. The epitopes used for the construction of the candidate vaccine were predicted from five viral structural proteins (G, F, M, N, P) using several immunoinformatics tools to assess different epitope characteristics, namely, the VaxiJen server for antigenicity, the IEDB immunogenicity tool for immunogenicity, the AlgPred server for allergenicity, ToxinPred for toxicity, and the IFNepitope server for interferon-gamma induction. The ProtParam server was used for physicochemical properties, GROMACS for molecular dynamics simulation and analysis, and finally, the SnapGene tool for molecular cloning. The proposed vaccine molecule consisted of 501 amino acids, encompassing 7 B cell epitopes, 14 CTL epitopes, and 4 HTL epitopes. The physicochemical analysis of the vaccine construct showed a molecular weight of 54.6 kDa and an acidic, stable molecule with an instability index of 38.3, an aliphatic index of 62.89, and a grand average of hydropathicity of -0.476. Moreover, the docking results and molecular dynamics of the vaccine molecule and TLR-3 showed a global energy of -1.58 kcal/mol, an atomic contact energy of 2.98 kcal/mol, and an RMSD of 0.65 nm. The radius of gyration showed a relatively steady value throughout the simulation period, a result suggestive of a stable compact structure and a promisingly effective vaccine construct. Overall, the multi-subunit construct is a promising NiV vaccine candidate with a relatively stable compact structure; experimental assessment of pathogenic priming and autoimmunity induction is still needed.


Introduction
In September 1998, an outbreak of viral origin occurred in Malaysia and Singapore. Its cause was later isolated, identified, and named Nipah virus (NiV) after the Malaysian village in which it was first discovered [1]. Since then, similar NiV outbreaks have occurred intermittently in South and Southeast Asia, threatening grave health and economic repercussions and a full-blown global pandemic [2,3]. NiV zoonosis occurs in both animals and humans, and the primary reservoir has been identified as the fruit bat. Infection in humans ranges from subclinical to acute respiratory distress and fatal febrile encephalitis, with case fatality rates ranging from 40-100% [3,4]. The clinical presentations of the disease include severe influenza-like symptoms such as fever, myalgia, and headache. Encephalitis symptoms appear within a week of infection, followed by rapid deterioration into coma and death within a few days [5]. Infection can be non-encephalitic, and cases of relapse or late-onset encephalitis have been reported [6,7].
NiV, a paramyxovirus, comprises, along with Hendra virus (HeV) and Cedar virus, the genus Henipavirus. The fruit bats of the genus Pteropus have been identified as infection reservoirs of NiV [8,9], and humans can acquire the infection either through the food-borne route, from food contaminated with saliva, excrement, or semen of carrier bats, or through direct contact with infected humans and animals [10,11,4]. The viral infection is mediated by the highly conserved Ephrin-B2/B3 receptors [12], which are mainly expressed in the brain, endothelial, and smooth muscle cells in the arterial vessels [13]. NiV is a single-stranded negative-sense RNA enveloped virus with a genome size of 18.2 kb, which encompasses 6 viral genes encoding 9 viral proteins: nucleoprotein N, phosphoprotein P, nonstructural proteins W, V, and C, matrix protein M, fusion protein F, glycoprotein G, and large polymerase L [14]. Binding to the host Ephrin receptors is mediated by the viral glycoprotein G, while membrane fusion and viral entry are the function of protein F [14][15][16]. Proteins N and P form a complex that binds the viral RNA, forming the nucleocapsid that coats it [17]. The viral genetic material is replicated with the help of the RNA polymerase L, and protein M mediates the budding and release of viral particles from the host cells [18]. The nonstructural accessory proteins W, V, and C, on the other hand, are key components of innate immune response evasion [19,10]. The World Health Organization (WHO) has prioritized NiV infection for research and development in emergency contexts by listing it among the diseases that pose the greatest public health risk due to their epidemic potential [10]. This risk is aggravated by the lack of approved protective vaccines and/or effective treatment, which has prompted the search for an efficient, safe prophylactic agent against the zoonotic infection.
A few attempts have been made towards the development of human vaccines; some were glycoprotein G-based subunit vaccines, and others were vector-based recombinant vaccines of proteins F and G. These vaccines showed promising results in various animal models; however, they have not been proven protective in humans [21,14,5,22]. Therefore, the aim of the present study is to exploit immunoinformatics tools and the viral structural protein data to design a multi-epitope vaccine capable of eliciting a protective humoral and cell-mediated immune response against NiV. Two major genetic lineages of NiV have been identified thus far, namely NiV Malaysia and NiV Bangladesh, which share 92% nucleotide homology, and most of the variations reside in the region of the non-structural genes [23,24,4]. Therefore, the construction of the candidate vaccine was based entirely on the viral structural proteins.
Retrieval of viral protein sequences
Cytotoxic T lymphocyte (CTL) epitopes prediction
The CTL epitopes from all proteins with an antigenicity score of more than 0.4 were predicted using the artificial neural network algorithm-based NetCTL 1.2 server (http://www.cbs.dtu.dk/services/NetCTL/). The process is based on MHC-I binding peptide prediction, proteasomal C-terminal cleavage, and the transport efficiency of the transporter associated with antigen processing (TAP). The threshold value was set to 0.75, which indicates 80% sensitivity and 97% specificity [26].
The predicted CTL epitopes were then screened independently for antigenicity using the VaxiJen 2.0 server. This was followed by scanning the antigenic peptides for toxicity using the ToxinPred server, with a threshold value of 0.0 (https://webs.iiitd.edu.in/raghava/toxinpred/multi_submit.php) [27]. The nontoxic epitopes were then subjected to immunogenicity screening using the class I immunogenicity tool of the Immune Epitope Database (IEDB) version 2.22.
The resultant epitopes with low percentile ranks were then screened for allergenicity with the AlgPred server (https://webs.iiitd.edu.in/raghava/algpred/submission.html), using the support vector machine (SVM) module based on amino acid composition as the prediction approach [30]. The nonallergenic epitopes were then screened for antigenicity and toxicity with the VaxiJen 2.0 server and the ToxinPred server, respectively. To explore the ability of the resultant epitopes to induce the production of interferon-gamma (IFN-γ) by MHC-II activated CD4+ T helper cells, the IFNepitope server (http://crdd.osdd.net/raghava/ifnepitope/) was used following the Motif and SVM hybrid approach [31]. The predicted epitopes were then manually inspected for overlaps.
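The CTL screening cascade described above can be pictured as a chain of threshold filters. The following is a minimal sketch only: the record type, the peptide strings, and all scores are hypothetical, and the direction of the ToxinPred comparison (scores below the threshold treated as nontoxic) is an assumption rather than something stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CtlCandidate:
    """One predicted CTL peptide with the score returned by each server."""
    peptide: str
    netctl_score: float      # combined NetCTL score
    vaxijen_score: float     # VaxiJen antigenicity score
    toxinpred_score: float   # ToxinPred SVM score
    immunogenicity: float    # IEDB class I immunogenicity score


# Thresholds quoted in the text; comparison directions are assumptions.
NETCTL_THRESHOLD = 0.75
VAXIJEN_THRESHOLD = 0.4
TOXINPRED_THRESHOLD = 0.0


def passes_ctl_screen(c: CtlCandidate) -> bool:
    """Apply the screening steps in the order given in the text."""
    return (
        c.netctl_score >= NETCTL_THRESHOLD
        and c.vaxijen_score > VAXIJEN_THRESHOLD
        and c.toxinpred_score < TOXINPRED_THRESHOLD  # assumed: below = nontoxic
        and c.immunogenicity > 0.0                   # positive immunogenicity
    )


def select_ctl_epitopes(candidates):
    """Return the peptides that survive every filter."""
    return [c.peptide for c in candidates if passes_ctl_screen(c)]


# Hypothetical peptides and scores, for illustration only.
example = [
    CtlCandidate("PEPTIDEAA", 0.91, 0.62, -0.85, 0.12),
    CtlCandidate("PEPTIDEBB", 0.80, 0.35, -0.40, 0.30),  # fails antigenicity
    CtlCandidate("PEPTIDECC", 0.77, 0.55, 0.20, 0.25),   # fails toxicity
]
print(select_ctl_epitopes(example))
```

Keeping each threshold as a named constant makes it easy to see which cut-off removed a given peptide.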

Population coverage
The worldwide population coverage of the predicted epitopes for MHC-I and MHC-II alleles was estimated using the population coverage tool of IEDB (http://tools.iedb.org/population/) [32].

B-cell epitope and antibody prediction
B cell epitopes are key players in humoral immunity: they bind surface immunoglobulins and drive the B cells into the immune response [33]. The antibody epitope prediction tool of the Immune Epitope Database and Analysis Resource (IEDB) (http://tools.iedb.org/bcell/) was used to predict the linear B cell epitopes with the BepiPred linear epitope prediction method [34], the Emini surface accessibility prediction method [35], and the Kolaskar and Tongaonkar antigenicity method [36]. Epitopes were selected for inclusion in the proposed vaccine construct only if they were predicted by the first method and also fulfilled the surface accessibility and antigenicity criteria of the latter two methods. The class of antibodies induced by these epitopes was predicted using the IgPred server (https://webs.iiitd.edu.in/raghava/igpred/) by scanning antibody-specific motifs in the peptide sequence [37].

Multiepitope vaccine construction
For devising the final vaccine construct, the predicted CTL, HTL, and B cell epitopes were joined together using linkers to ensure efficient vaccine construction and proper subunit separation. The B cell epitopes were linked together and with CTL epitopes with the AAY linker, while HTL epitopes were linked together and with CTL epitopes with the GPGPG linker. At the N-terminal, a cysteine residue was added to facilitate the future conjugation of the vaccine construct with the carrier protein [38]. Furthermore, to ensure efficient purification of the proposed vaccine, a four amino acid (EPEA) tag was added at the C-terminal of the construct [39]. To explore any potential autoimmunity, BLASTp of the candidate vaccine against the human proteome in the UniProt database was conducted to screen for considerable similarity with human proteins indicated by > 40% identity and E-value [40]. The final construct was also assessed for antigenicity and allergenicity using the VaxiJen 2.0 server and the AlgPred server, respectively. The ExPASy ProtParam tool (https://web.expasy.org/protparam/) was used to determine the physicochemical properties, including hydropathicity, charge, instability, half-life, theoretical isoelectric point, and molecular weight [41].

Structure modeling and validation
Initially, the secondary structure of the candidate vaccine was determined using the PSIPRED server (http://bioinf.cs.ucl.ac.uk/psipred/) [42]. Homology modeling of the candidate vaccine 3D structure was carried out using the ExPASy SWISS-MODEL tool. The resultant structure was further analyzed and validated with the ProSA-web server (https://prosa.services.came.sbg.ac.at/prosa.php) [43], Ramachandran plot analysis using the MolProbity server (http://molprobity.biochem.duke.edu/logout_destroy.php) [44], and the ERRAT server (https://servicesn.mbi.ucla.edu/ERRAT/) [45].
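The linker scheme described for the multiepitope construct can be illustrated with a short sketch. The epitope strings below are placeholders, not the study's epitopes, and two details are assumptions: the overall order (cysteine, B cell, CTL, HTL, EPEA) and the use of AAY between consecutive CTL epitopes, which the text does not state explicitly.

```python
AAY = "AAY"      # linker for B cell and CTL epitopes
GPGPG = "GPGPG"  # linker for HTL epitopes


def build_construct(b_cell, ctl, htl, n_term="C", c_term_tag="EPEA"):
    """Concatenate epitopes as Cys - B cell - CTL - HTL - EPEA.

    AAY joins the N-terminal cysteine, the B cell epitopes, and the CTL
    epitopes (CTL-to-CTL AAY linkage is an assumption). GPGPG joins the
    last CTL epitope to the HTL block and the HTL epitopes to each other.
    """
    aay_block = AAY.join([n_term] + list(b_cell) + list(ctl))
    return GPGPG.join([aay_block] + list(htl)) + c_term_tag


# Placeholder peptides, for illustration only.
construct = build_construct(
    b_cell=["KTW"],
    ctl=["RSV", "DLM"],
    htl=["NQF", "HIE"],
)
print(construct)
print(len(construct))
```

Building the sequence from lists rather than by hand makes it straightforward to reorder epitopes and re-run the downstream antigenicity and allergenicity checks.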

Molecular docking and simulation
The binding and interaction of antigenic molecules with a specific immune receptor is a key step for a proper, efficacious immune response. It is therefore essential to examine the interaction between the resultant vaccine and a class of pattern recognition receptor that is responsible for the initiation of the innate immune response. The vaccine-Toll-like receptor 3 docking was carried out using the PatchDock server (https://bioinfo3d.cs.tau.ac.il/PatchDock/) [46] and the HADDOCK server 2.4 (https://wenmr.science.uu.nl/haddock2.4/) [47] with the servers' default parameters.
The docked complex was then subjected to pre-simulation modification using the protein prep wizard [48] and PyMOL software. The molecular simulation and molecular dynamics analysis were carried out using the Groningen Machine for Chemical Simulations (GROMACS) molecular dynamics package, using the CHARMM TIP3P force field and the TIP4P-FQ water model. The calibration and charge neutralization of the simulation system were carried out. The system temperature and pressure were equilibrated at 300 K and 1 bar for a 100 ns equilibration period with an interval of 2 fs. The simulation process was visualized using the VMD tool. The root mean square deviation (RMSD) and radius of gyration were then determined.
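For readers unfamiliar with the two trajectory metrics named above, the definitions can be sketched in plain Python. This is an illustration of the formulas only, not the GROMACS implementation: the RMSD function assumes the two frames are already superimposed (analysis tools usually perform a least-squares fit first), and the coordinates and masses shown are made up.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z).

    No superposition is performed; the frames are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the center of mass."""
    total = sum(masses)
    center = [
        sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3)
    ]
    squared = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for c, m in zip(coords, masses)
    )
    return math.sqrt(squared / total)


# Made-up coordinates in nm, for illustration only.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
print(rmsd(frame0, frame1))
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0]))
```

A radius of gyration that stays roughly constant over the trajectory is what the text interprets as a compact, stable fold.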

Immune response simulation
The immune response to the proposed vaccine was simulated using the C-ImmSim server 10.1 (http://www.cbs.dtu.dk/services/C-ImmSim-10.1/) [49], an agent-based tool that uses a position-specific scoring matrix (PSSM) and machine learning for the prediction of immune responses. It simulates the production of lymphoid and myeloid cells by hematopoietic stem cells in the bone marrow and the immune response in the thymus and lymph node. The simulation parameters used were random seed: 12345, simulation steps: 100, and simulation volume: 10 L. The default injection schedule was used with the antigen name, injection time: 0, and injection amount: 1000.

In silico molecular cloning
The amino acid sequence of the vaccine construct was subjected to reverse translation and codon optimization with the Java Codon Adaptation Tool (JCat) (http://www.jcat.de) [50]. The GC content of the resultant DNA sequence was 52%, suggesting efficient transcription and translation. The sequence was then used for in silico molecular cloning into the expression plasmid vector pET28 (+) of E. coli (strain K12) [51], and NcoI and NotI restriction sites were introduced at the N- and C-terminals of the construct, respectively. The molecular cloning was carried out using the SnapGene tool version 5.2.
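Two routine checks on a codon-optimized sequence are its GC content and the absence of internal recognition sites for the cloning enzymes. The sketch below shows one way to compute both; the DNA string is a made-up example, and the function names are illustrative rather than part of JCat or SnapGene. The recognition sequences used for NcoI (CCATGG) and NotI (GCGGCCGC) are palindromic, so scanning one strand is sufficient.

```python
def gc_content(dna: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = dna.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


# Recognition sequences of the two enzymes named in the text.
SITES = {"NcoI": "CCATGG", "NotI": "GCGGCCGC"}


def internal_sites(dna: str) -> dict:
    """Map each enzyme to the 0-based positions where its site occurs."""
    seq = dna.upper()
    hits = {}
    for enzyme, site in SITES.items():
        hits[enzyme] = [
            i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)
        ]
    return hits


# Made-up sequence, for illustration only.
insert = "AACCATGGTT"
print(gc_content(insert))
print(internal_sites(insert))
```

An insert that already contains one of these sites would be cut during cloning, so any hit reported here would call for a synonymous codon change before the flanking sites are added.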

T cell epitopes prediction
The antigenicity screening of the retrieved NiV protein sequences showed a score greater than the threshold value of 0.4 for glycoprotein G (0.5095), fusion protein F (0.5395), nucleoprotein N (0.5594), and phosphoprotein P (0.5767), suggesting probable antigenicity, while the antigenicity score of the matrix protein M (0.3969) was indicative of a non-antigenic nature; it was therefore excluded from further analysis. The 4 antigenic proteins were then submitted to the NetCTL server for CTL epitope prediction, resulting in 12 potential epitopes from protein G, out of which 11 were nontoxic and showed a positive immunogenicity score; all were included in the final vaccine construct. For protein F, 21 potential CTL epitopes were predicted, of which 4 were nontoxic; of these, 2 with a positive immunogenicity score were included in the vaccine construct. Eight potential CTL epitopes were predicted from the nucleoprotein N; all were nontoxic, and the 2 included in the final vaccine construct were the only ones with a positive immunogenicity score. For phosphoprotein P, 10 potential CTL epitopes were predicted, 5 of which were nontoxic, and only 3 epitopes were selected for the final vaccine construct for their positive immunogenicity score. All the epitopes ultimately selected for inclusion in the vaccine construct showed antigenicity scores greater than the threshold value (Table 1).
A percentile rank of less than 10 was used as the cut-off value for the selection of predicted HTL epitopes, which resulted in 36 peptides from protein G. Of these, 5 were non-allergenic, 3 were nontoxic, and 2 were antigenic, but only one showed IFN-γ inducibility, and it was selected for inclusion in the final construct. For protein F, none of the 18 predicted HTL epitopes showed IFN-γ inducibility, while 2 epitopes were selected from the 20 predicted HTL epitopes from nucleoprotein N, and one epitope from protein P, that fulfilled the selection criteria (Table 2).
The world population coverage of the selected epitopes was analyzed for MHC-I and MHC-II alleles. The coverage of HLA class I restricted epitopes was 98.59% of the world population, and the coverage of HLA class II restricted epitopes was 69.81% of the world population, while the combined HLA class I and class II coverage was 99.57% of the world population. This suggests a remarkably high population coverage for the given multisubunit vaccine (Fig. 1).
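The three coverage figures reported above are mutually consistent if class I and class II coverage are treated as independent, so that an individual is uncovered only when both classes fail. The sketch below reproduces that arithmetic; whether the IEDB tool computes the combined value in exactly this way is an assumption.

```python
def combined_coverage(class_i_pct: float, class_ii_pct: float) -> float:
    """Combined coverage in percent, assuming the two classes are independent."""
    p1 = class_i_pct / 100.0
    p2 = class_ii_pct / 100.0
    # Uncovered fraction is the product of the two uncovered fractions.
    return 100.0 * (1.0 - (1.0 - p1) * (1.0 - p2))


# Values reported in the text: 98.59% (class I) and 69.81% (class II).
combined = combined_coverage(98.59, 69.81)
print(round(combined, 2))  # 99.57
```

Written out, the calculation is 1 - (0.0141 x 0.3019) = 0.9957, which matches the reported combined coverage of 99.57%.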

B cell epitope prediction
B cell epitopes derive their importance from being key players in the adaptive humoral immune response. Upon their recognition by B lymphocytes, they trigger antibody production, which is an important defense process against viral infection. Three methods were employed to ensure proper epitope prediction: epitopes predicted by the BepiPred prediction method were assessed for surface accessibility, flexibility, and antigenicity by the Emini and Kolaskar & Tongaonkar methods, respectively. Only epitopes predicted by all three methods were included in the final vaccine construct. Ultimately, 7 epitopes of variable sequence length were predicted by the three methods, one each from proteins G, F, M, and P, and three from protein N. All predicted epitopes induce IgG antibodies.

Multi-subunit vaccine construction
The final vaccine construct comprised 7 B cell epitopes, 14 CTL epitopes, 4 HTL epitopes, 2 linkers, an EPEA tag at the C-terminal, and a cysteine residue, resulting in a sequence of 501 amino acids: CAAYVRFENTTSDKGKIPSKVIKSYYGTMDIKKINEGLLDAAYNTYSRLEDRRVRPTSSGDLYYAAYIKSISSESMEGVSDFSPSSWEHGGYLDKVEPEIDENGAAYKTARDSSK

Physicochemical properties of the vaccine construct
The properties of the novel vaccine assessed by the ProtParam server showed a molecular weight of 54.6 kDa and a theoretical isoelectric point (pI) of 4.78, suggesting an acidic nature, with a total of 61 negatively charged residues and 41 positively charged residues, which make it a readily soluble protein at blood pH.
The estimated half-life is 1.2 hours in mammalian reticulocytes in vitro, >20 hours in yeast in vivo, and >10 hours in E. coli in vivo. The instability index was computed to be 38.30, which classifies the protein as a stable molecule. The vaccine construct also showed an aliphatic index of 62.89, indicating a thermostable protein. The grand average of hydropathicity (GRAVY) was -0.476; a negative value indicates a hydrophilic molecule.

Tertiary structure modeling and analysis
The results of secondary structure prediction by the PSIPRED server showed predominant helix and coil structures (Fig. 2). Homology modeling was then carried out using the SWISS-MODEL tool to predict the tertiary structure of the multi-subunit construct, which resulted in 29 modeling templates. The top three templates, with the best sequence coverage and the highest identity (62.14% each), were selected for building the 3D models of the protein. The quality of these models was then assessed and validated using the Ramachandran plot, the ERRAT tool to calculate the overall quality factor, the VERIFY3D tool, and the ProSA-web z-score, which is compared against the scores of structures determined by X-ray crystallography and NMR. The best-predicted model showed that 93.1% of the residues fall within the favorable region in the Ramachandran plot (Fig. 3), and an overall quality factor of 90.426%, calculated by plotting the statistics of non-bonded interactions between different atom types and the error function value against the position of a 9-residue sliding window, compared with statistics from highly refined structures. The value is less than the 91% threshold for an average quality (Fig. 4). The model also showed a Z-score of -3.84, evaluated against the scores of structures determined by both X-ray crystallography and NMR analysis (Fig. 5).

Molecular docking and dynamics analysis
The docking of the novel vaccine construct with Toll-like receptor 3 (PDB ID: 1ziw) was performed with the PatchDock server. The generated docking solutions were sorted based on the geometric shape complementarity score, the approximate interface area of the complex, and the atomic contact energy. The top 10 protein-receptor docking solutions were submitted to the FireDock server for interaction refinement. The selected docking solution had a global energy of -1.58 kcal/mol, an attractive van der Waals force of -9.13 kcal/mol, a repulsive van der Waals force of 1.83 kcal/mol, and an atomic contact energy of 2.98 kcal/mol (Fig. 6).
The molecular dynamics simulation results showed a root mean square deviation (RMSD) of 0.65 nm and a radius of gyration with relatively minor fluctuation throughout the 1000 ps period of simulation (Fig. 7, 8).
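Several of the indices reported for the construct follow directly from amino acid composition. As a minimal sketch (not the ProtParam implementation), GRAVY is the mean Kyte-Doolittle hydropathy per residue, the aliphatic index is a weighted sum of the mole percentages of Ala, Val, Ile, and Leu, and the charged-residue counts are Asp+Glu versus Arg+Lys. The short test sequences below are arbitrary.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def mole_percent(seq: str, aa: str) -> float:
    """Percentage of residues in seq that are the given amino acid."""
    return 100.0 * seq.count(aa) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu))."""
    return (
        mole_percent(seq, "A")
        + 2.9 * mole_percent(seq, "V")
        + 3.9 * (mole_percent(seq, "I") + mole_percent(seq, "L"))
    )


def charged_counts(seq: str):
    """Return (negative, positive) residue counts: (Asp+Glu, Arg+Lys)."""
    return seq.count("D") + seq.count("E"), seq.count("K") + seq.count("R")


# Arbitrary short sequences, for illustration only.
print(gravy("AR"))
print(aliphatic_index("AVIL"))
print(charged_counts("DEKRA"))
```

An excess of Asp and Glu over Arg and Lys, as reported for the construct (61 versus 41), is what pushes the theoretical pI below neutral.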

Immune response simulation
Assessing the immune response to the designed vaccine is an important step towards ensuring vaccine efficiency. The simulation results showed increased and sustained levels of the memory and active B cell populations. A similar result was shown by the T cell population: the memory and active T helper cells maintained a high level for the entire simulation period. The immunoglobulin levels, on the other hand, peaked by the second week of simulation and then regressed, although they remained at relatively acceptable levels. The concentration of cytokines and interleukins remained elevated for more than half the simulation period, followed by a gradual decline (Fig. 9), suggesting a sustained humoral and cell-mediated immune response.

In silico molecular cloning
The Java codon adaptation tool was used to optimize the novel vaccine sequence prior to the insertion of the vaccine DNA sequence into the pET28 (+) plasmid of E. coli (strain K12). The GC content of the vaccine DNA sequence was 51.8%, suggesting good expression of the vaccine in the E. coli host (Fig. 10).

Discussion
The catastrophic consequences of the ongoing COVID-19 pandemic call for the preemptive development of vaccines for all pandemic-prone diseases. NiV infection is a zoonotic disease that can be transmitted to humans directly from natural reservoirs, indirectly from other infected animals, or through direct contact with other infected humans, underscoring the potential pandemic nature of the disease. Several approaches have been used to develop vaccines against animal NiV infection, including the use of glycoprotein G and protein F subunits as immunogens, non-replicating vectors, and DNA vaccines. These trials showed very promising outcomes. However, as yet, there are no approved vaccines against human infection, despite a number of ongoing clinical trials [52][53][54][55][56]. In the present study, a multi-subunit vaccine was designed using the virus structural proteins, which are moderators of the infection process [2]. Unlike single viral protein-based vaccines, multi-epitope vaccines are believed to provide better immunity by inducing a broader immune response [57]. Furthermore, single viral protein-based vaccines are inclined to lose their efficiency due to the accumulation of mutations in the target protein, a phenomenon seldom noted in multisubunit vaccines [58]. For designing the current candidate vaccine, 6 NiV structural proteins were initially selected; however, upon primary screening for antigenicity, the large polymerase protein (L) was excluded due to lack of antigenicity. The remaining 5 proteins were then screened for potential epitopes, and none of the CTL and HTL epitopes predicted from the matrix protein (M) was eligible for selection in the final vaccine construct, due either to lack of antigenicity or to toxicity.
Proper presentation of these epitopes by the antigen-presenting cells (APC) and a rapid and efficient immune response are contingent upon accurate digestion of the constructed vaccine molecule by the proteasomes of different immune cells and binding to the transporters associated with antigen processing (TAP). Therefore, the predicted epitopes were linked together using linkers that facilitate these cleavage and binding processes [59]. The addition of a cysteine residue to the N-terminal of the vaccine construct enhances the binding of the vaccine to the protein carrier, while the four amino acids EPEA were added to the C-terminal to facilitate downstream purification of the vaccine [39]. The molecular weight of the candidate vaccine (54.6 kDa) falls well within the optimum molecular weight for vaccines [60]. The physicochemical results showed a soluble, thermostable, hydrophilic construct, suggesting adequate vaccine accessibility and absorption [61], and a rapid, sustainable immune response. The secondary structure of the protein molecule is a key determinant of its tertiary structure; the predicted structure revealed predominantly helix and coil secondary structures.
The validation results of the 3D structure model of the vaccine molecule showed an acceptable model, with 93.1% of the residues falling within the favorable region in the Ramachandran plot and an overall quality score of 90.4%, in addition to a Z-score of -3.84 [62]. An appropriate immune response requires recognition and binding of the vaccine molecule by the toll-like receptors. Therefore, the binding between the vaccine and TLR-3 was studied through docking of the two molecules, and the strength and stability of the interaction were assessed by molecular simulation and dynamics analysis. The estimated potential energy and the RMSD plot, which showed an RMSD value of 0.65 nm for the backbone of the docked complex, indicate a stable vaccine-TLR3 complex. Despite the promising results of the designed vaccine construct, the present study nonetheless has some limitations: experimental validation of the vaccine efficiency is needed, and further investigations on pathogenic priming and autoimmune induction are warranted.

Conclusion
The inclusion of NiV infection in the WHO priority disease list for research and development in emergency contexts underlines the importance of rapid vaccine design and development for an emerging disease with epidemic potential. The current study is an attempt to develop a multisubunit vaccine effective enough to mount both humoral and cell-mediated immune responses to the viral structural proteins using reverse vaccinology tools.
In conclusion, the proposed vaccine fulfilled all antigenicity, immunogenicity, allergenicity, and physicochemical requirements of an efficacious multi-subunit vaccine against NiV infection in humans.
Declarations

Figure 2
The results of secondary structure prediction by the PSIPRED server showed predominant helix and coil structures.

Figure 3
The best-predicted model showed 93.1% of the residues fall within the favorable region in the Ramachandran plot.

Figure 5
The model also showed a Z-score of -3.84 determined by both x-ray crystallography and NMR analysis.

Figure 6
The selected docking solution had a global energy of -1.58 kcal/mol, an attractive van der Waals force of -9.13 kcal/mol, a repulsive van der Waals force of 1.83 kcal/mol, and an atomic contact energy of 2.98 kcal/mol.