In silico analysis of NHP2 membrane protein, a novel vaccine candidate present in the RD7 region of Mycobacterium tuberculosis

Mycobacterium tuberculosis, the etiological agent of tuberculosis, is one of the most challenging pathogens. Only a few protective measures exist against it, such as the BCG vaccine, which itself has poor efficacy in preventing adult tuberculosis. Although different trials of alternative vaccines have been conducted, those studies have not shown very promising results. In the current study, computational tools were used to assess the potential of a novel hypothetical mycobacterial protein, identified by subtractive hybridization, as a vaccine candidate. NHP2 (Novel Hypothetical Protein 2), housed in the RD7 region of clinical strains of M. tuberculosis, was studied for its physical, chemical, immunological and structural properties using different computational tools. PFAM and Gene Ontology studies indicated that NHP2 is functionally active, with a possible antibiotic binding domain. Computational tools used to assess the toxicity, allergenicity and antigenicity of the protein indicated its antigenic nature. Immune Epitope Database (IEDB) tools were used to study the T and B cell determinants of the protein. The 3D structure of the protein was modeled, refined and validated using bioinformatics tools. The validated tertiary structure was docked against the TLR3 immune receptor to study the binding affinity and docking scores. Molecular dynamics simulation of the resulting protein-protein complex was also performed. NHP2 was predicted to activate the host immune response against the tubercle bacillus and could be explored as a potential vaccine in the fight against tuberculosis.


Introduction
Mycobacterium tuberculosis is a pathogen of global threat, killing more than 1 million individuals yearly. According to the World Tuberculosis Report of 2019, 9.9 million individuals were infected with TB, and multidrug-resistant TB contributed about 0.6 million cases [WHO Global Tuberculosis Report 2020].
The development of multidrug-resistant TB (MDR-TB) has made tuberculosis more difficult to treat (Mirzayev et al. 2021). Immunocompromised individuals and people with HIV infection are unable to respond immunologically against tuberculosis, leading to a high death rate. Studies show that upon exposure to M. tuberculosis proteins, CD4+ and CD8+ cell-mediated immune responses are induced, accompanied by the secretion of cytokines. Bacillus Calmette-Guérin (BCG) has been the only vaccine against tuberculosis since 1923, and its efficacy against adult pulmonary tuberculosis is less than 80%. Hence, more effort should be devoted to developing advanced and more efficient TB vaccines (Kaufmann 2020).
Different regions of the mycobacterial genome, such as RD1 (Region of Difference 1), RD3, RD7, RD9 and RD11, were found to be immunologically important and have been extensively studied in the search for potential vaccine candidates. Mycobacterial proteins like the Antigen 85 complex and ESAT-6 family proteins, with low molecular weight and high GC content, play a major role in the pathology and virulence of the bacteria (Gomez et al. 2000). Different live attenuated vaccine trials have been undertaken using secretory proteins, of which peptide vaccines were considered more efficient and safer because of their epitope-based immune response (Mustafa 2021). After clinical trials, M72/AS01E, a peptide vaccine, was considered to be clinically potent in tuberculosis patients (Ji et al. 2019). H4/IC31, another peptide vaccine against tuberculosis, has also been considered effective in adult pulmonary tuberculosis after a phase I clinical trial (Nemes et al. 2018).
Different approaches have been pursued in developing vaccines against adult pulmonary tuberculosis, of which peptide vaccines and subunit vaccines have shown a strong response. Responses of subunit epitopes of M. tuberculosis proteins against human T and B cells have been studied using different computational tools (Ejalonibu et al. 2021; Albutti 2021). The current study focuses on assessing the potential of NHP2, a hypothetical protein of M. tuberculosis, as a vaccine candidate using different bioinformatics tools. The emergence of many computational tools and software packages has played a major role in vaccine design, which can be as efficient as conventional strategies. A comparative study conducted earlier by our group between clinical strains of M. tuberculosis from South India and the laboratory strain H37Rv showed the presence of a 4.5 kb genomic sequence unique to the clinical isolates and absent in H37Rv. The NHP2 protein (Novel Hypothetical Protein 2) in this 4.5 kb locus has been mapped to the RD7 region of the genome, intergenic to Rv1976c and Rv1977. Various RDs (Regions of Difference) and RvDs (Rv deletions) were investigated earlier by our research group (Sarojini et al. 2005; Soman et al. 2007; Sarojini et al. 2011). BLAST studies performed earlier showed 99% sequence similarity between the NHP2 region and Mycobacterium canettii, a primitive strain of the tubercle bacillus (Sarojini and Mundayoor 2021). The ancient nature of the NHP2 protein located in the RD7 region makes it significant to study for its immunological properties and its potential as a vaccine candidate.
The homology structure and functional properties of a hypothetical protein, NHP1, present in the RD7 region were previously studied by our group using computational tools (Kootery et al. 2022). Multiple tools were used to study each property of NHP2 in order to cross-check the results. The physical and chemical properties of the protein were studied using Expasy ProtParam and the SMS Suite software. The conserved domains and evolutionary functions of the protein were studied using NCBI CDD-BLAST, SMART and PFAM. The cellular location of the protein and the T and B cell epitopes of the NHP2 protein were studied using multiple tools. The 3-D structure of the protein was developed by homology modeling, refined using the Galaxy web server, and validated using the Ramachandran plot, Z-score and Verify 3D. Molecular docking and molecular dynamics simulation studies of the protein with the human TLR3 receptor were carried out and validated using different computational tools. The computational study indicated that the NHP2 protein is an efficient candidate for a tuberculosis vaccine. Further studies using animal models should be conducted to understand the immunological properties of the NHP2 protein.

Sequence retrieval
The nucleotide sequence of the subtracted genomic region, N4.5, was retrieved from the National Center for Biotechnology Information (NCBI; GenBank accession number GU994138.2), where it had been deposited in an earlier study (Sarojini and Mundayoor 2020). One of the potential ORFs in the region, 807 bp long, was translated to its corresponding 268 amino acid sequence using the EMBL Translate tool (https://www.ebi.ac.uk/Tools/st/emboss_transeq/). This amino acid sequence was then used for the physical, chemical, immunological and structural studies using various bioinformatics servers.
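The translation step can be illustrated with a minimal Python sketch. It is not the EMBOSS Transeq implementation; it assumes the standard genetic code, a forward reading frame starting at the first base, and that the stated 807 bp ORF length includes the terminal stop codon (which is consistent with 268 residues). The toy ORF below is hypothetical and is not the NHP2 sequence.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built from the
# conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna: str) -> str:
    """Translate an ORF in frame 1, stopping at the first stop codon."""
    dna = dna.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    protein = []
    for i in range(0, usable, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy ORF (not NHP2): ATG GCT AAA TAA
toy_protein = translate_orf("ATGGCTAAATAA")

# Length bookkeeping for an 807 bp ORF that ends in a stop codon.
orf_length_bp = 807
codons = orf_length_bp // 3   # 269 codons
residues = codons - 1         # 268 residues once the stop codon is excluded
```

The length check shows why an 807 bp ORF yields a 268-residue protein: 269 codons minus one stop codon.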

Physico-chemical characterization
Expasy's ProtParam server (https://web.expasy.org/protparam/) (Gasteiger et al 2005) was used for the chemical and physical characterization of the 268 amino acid long protein. This tool predicts the chemical formula, molecular weight, pH, extinction coefficient and hydrophobicity of the protein. The SMS v2.0 (Sequence Manipulation Suite) server (https://www.bioinformatics.org/sms2/) (Stothard 2000) was also used to study the physical and chemical properties of the query protein in order to cross-check the results.
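Two of the sequence-derived quantities reported later, the GRAVY score and the aliphatic index, are simple enough to sketch by hand. The following is a minimal illustration, not ProtParam's code: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the Ikai (1980) formula for the aliphatic index, and the toy sequence is hypothetical rather than NHP2.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# Hypothetical toy peptide, used only to exercise the functions.
toy_sequence = "MAVLIKGS"
toy_gravy = gravy(toy_sequence)
toy_aliphatic = aliphatic_index(toy_sequence)
```

A negative GRAVY indicates a net hydrophilic sequence and a positive one a net hydrophobic sequence; a higher aliphatic index is usually read as a sign of greater thermostability.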

Subcellular localization
The subcellular localization tools assist in identifying whether the protein is attached to the outer cellular membrane or to the cytoplasmic membrane. The location of NHP2 was studied using different cellular localization prediction servers including TBpred (https://webs.iiitd.edu.in/raghava/tbpred) (Rashid et al 2001), CELLO (subCELlular LOcalization prediction) (http://cello.life.nctu.edu.tw/) (Yu et al 2004) and TMHMM (http://www.cbs.dtu.dk/services/TMHMM/) (Möller et al 2001). The probability of the query protein being a signal peptide or lipoprotein, with or without transmembrane helices, was also checked using different servers like SignalP (https://services.healthtech.dtu.dk/service.php?SignalP-4.

Virulence factor analysis
The pathogenicity of the protein was studied with the aid of the VirulentPred server (http://203.92.44.117/virulent/) (Garg and Gupta 2008), which predicts the virulence factor of the protein.

B-Cell epitope prediction
The ability of NHP2 to induce B cell immunity was studied using different servers like BcePred, ABCpred and BepiPred. The BcePred (https://webs.iiitd.edu.in/raghava/bcepred/bcepred_submission.html) (Saha and Raghava 2004) and ABCpred (https://webs.iiitd.edu.in/raghava/abcpred/ABC_method.html) (Saha and Raghava 2006) servers were used to predict the linear B cell epitopes. In ABCpred, a threshold value of 0.5 was used with an epitope length of 16 amino acids. The IEDB BepiPred server (http://tools.iedb.org/bcell/) (Jespersen et al 2017) was also used to study the B cell epitopes of NHP2. The BepiPred tool used the physical and chemical characteristics of the 268 amino acid long protein together with a Hidden Markov Model to predict the linear B cell epitopes.
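The window-and-threshold logic behind fixed-length linear epitope prediction can be sketched as follows. This is only an illustration of the bookkeeping: `placeholder_score` is a hypothetical stand-in (the fraction of polar or charged residues) and is not the neural network used by ABCpred; the window length of 16 and threshold of 0.5 are the settings stated above.

```python
POLAR_OR_CHARGED = set("DEKRNQSTH")


def sliding_windows(seq: str, width: int = 16):
    """Return (1-based start position, peptide) for every overlapping window."""
    return [
        (start + 1, seq[start:start + width])
        for start in range(len(seq) - width + 1)
    ]


def placeholder_score(peptide: str) -> float:
    """Hypothetical score in [0, 1]: fraction of polar or charged residues.
    A real predictor would replace this function."""
    return sum(aa in POLAR_OR_CHARGED for aa in peptide) / len(peptide)


def select_epitopes(seq: str, score_fn, width: int = 16, threshold: float = 0.5):
    """Keep the windows whose score reaches the threshold."""
    hits = []
    for position, peptide in sliding_windows(seq, width):
        score = score_fn(peptide)
        if score >= threshold:
            hits.append((position, peptide, score))
    return hits


# A 268-residue protein yields 268 - 16 + 1 = 253 candidate 16-mers.
n_windows = len(sliding_windows("A" * 268))
```

Under this scheme every 16-mer of the 268-residue protein is a candidate, and the threshold decides which candidates are reported as epitopes.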

Tertiary structure model
Homology protein modeling was used to study the three-dimensional arrangement of the atoms of the NHP2 protein. The homology model of the NHP2 protein was developed using the MODELLER (https://salilab.org/modeller/) (Šali et al 1995) and Robetta (https://robetta.bakerlab.org/) (Kim et al 2004) servers. The developed 3-D protein structure was validated using the servers PROCHECK (http://www.csb.yale.edu/Excite/AT-CSBquery.html) (Laskowski et al 1993) and Verify 3D (https://saves.mbi.ucla.edu/) (Eisenberg et al 1997). The stereochemical parameters of the developed 3D structure, along with its bond angles and bond lengths, were studied using PROCHECK. The Z-score validation of the protein was done using the ProSA-web server (https://prosa.services.came.sbg.ac.at/prosa.php) (Wiederstein and Sippl 2007). GalaxyRefine on the GalaxyWEB server (http://galaxy.seoklab.org/cgi-bin/help.cgi?key=METHOD&type=REFINE) (Ko et al 2012) was used to improve the predicted homology model. The Root Mean Square Deviation (RMSD) value of the structure was analyzed using the GalaxyWEB Refine server. The ligand binding site of the structure was predicted using the PrankWeb P2Rank server (https://prankweb.cz/) (Jendele et al 2019; Krivák and Hoksza 2018).
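For readers unfamiliar with the RMSD reported by refinement servers, the quantity itself is straightforward to compute. The sketch below assumes two equally ordered lists of atomic coordinates that have already been superposed; it does not perform the optimal superposition that structure tools normally apply first, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation between two pre-superposed coordinate sets.

    Each argument is a sequence of (x, y, z) tuples in the same atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom example: every atom is displaced by 1 unit along z.
initial_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
refined_model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = rmsd(initial_model, refined_model)
```

A small RMSD between the initial and refined models means that refinement adjusted the structure locally without changing the overall fold.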

Gene ontology study
The molecular function, cellular component and biological process of the NHP2 protein were studied using the Gene Ontology (QuickGO) server (https://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0008658) (Ashburner et al 2000). GO is the largest database of information on gene functions. A Cscore GO value is assigned based on global and local similarity evaluation within a range of 0-1, where a higher value indicates a higher-confidence prediction.

Molecular Docking and Molecular Dynamics Simulation
NHP2 protein was docked with the human TLR3 immune receptor to assess the binding affinity. The TLR3 structure (PDB ID 1ZIW) was downloaded from the Protein Data Bank (https://www.rcsb.org). Online docking servers including ClusPro 2.0 (https://cluspro.bu.edu/login.php) (Kozakov et  Mechanics with Generalized Born Surface Area (MM-GBSA) score was measured using the HawkDock server; the lowest prediction score was considered the optimal score. Molecular dynamics simulation studies of the docked TLR3-NHP2 complex were carried out using the iMODS web server (http://imods.Chaconlab.org/) (Blanco et al 2014). iMODS is a free and fast MD simulation server used for calculating the dynamics and flexibility of docked proteins.

Physico-chemical characterization
In silico studies of NHP2, the 268 amino acid long hypothetical protein, carried out using the Expasy ProtParam and SMS Suite servers revealed its physical and chemical characteristics. According to the tools used, NHP2 had a molecular weight of 28.149 kDa, pH 7.38 (neutral) and a theoretical pI of 6.82, implying that the protein is negatively charged. The half-life of NHP2 was estimated as 30 hours. NHP2 exhibited an average GRAVY of -0.015, indicating that the protein is not very hydrophobic. Various physicochemical parameters of NHP2 are given in Table 1.

Subcellular Localization
The cellular localization of the 268 amino acid long sequence, studied using SOSUI, TMHMM, CELLO, TBpred and TMBB Pred, predicted NHP2 to have a transmembrane signal peptide. TBpred, CELLO, SOSUI and TMBB Pred predicted NHP2 to be an extracellular, lipid-anchored soluble protein. According to SignalP, NHP2 was predicted to carry a lipoprotein signal peptide with a threshold of 0.7035 and a cleavage site between residues 23 and 24, as depicted in Fig. 2.
The TMHMM server showed the protein to have a transmembrane helix spanning amino acids 12-35 (Fig. 3). The transmembrane helix at residues 12-35 and the signal peptide were also predicted at the N terminus by the InterPro tool. The PFP-FunDSeqE tool predicted that the protein folding pattern of NHP2 would be "viral coat and capsid proteins," supporting the extracellular nature of the protein.
The immunological potential of a protein is determined by its antigenicity, allergenicity and toxicity. The VaxiJen server, with which the antigenicity was studied, works on the physical and chemical parameters of the 268 amino acid long protein. The default antigenicity threshold of 0.4 was used, and the NHP2 protein obtained an antigenicity score of 0.4789, making it antigenic in nature. The query protein's antigenicity was also predicted using the ANTIGENpro server, with an antigenic threshold of 0.684597. The query protein was predicted to be a probable non-allergen by the AllerTOP v.2.0 and AllergenFP servers and to be non-toxic by ToxinPred. NHP2 was predicted by the VirulentPred server to be non-virulent. VirulentPred is a server with 81.8% accuracy that predicts the virulence of a protein using the SVM method. The output thus showed the antigenic, non-allergenic, non-virulent and non-toxic nature of NHP2.

T-cell Epitope Prediction
Antigen-antibody interaction plays a major role in vaccine studies. Antibodies bind to the target antigen molecule in a specific pattern, on a structure called an epitope, which leads to the development of immunity. T cell and B cell epitopes are key factors in developing immune responses against infectious diseases. Ten T cell epitopes for the NHP2 protein were predicted by HLA-Pred. Genomic molecular mimicry scanning of the predicted epitope binders against several species revealed no similar sequences, making NHP2 a possible vaccine candidate. As shown in Table SI 1, the NetCTL server predicted four MHC ligands over the threshold of 0.75. T-helper cell epitopes for the HLA-DR allele were predicted using IEDB MHC II binders, with an adjusted rank ranging from 0.02 to 100. The recommended threshold value is 2; peptides with a lower adjusted rank, ranging from 0.02 to 2.2, are considered good binders and are shown in Table SI 2. The most essential requirement for a peptide to become a T-cell epitope is that it should form a complex with MHC-I molecules. The ability of the NHP2 protein residing on the cell surface to bind MHC-I for complex formation was studied using the server. The Class I pMHC T cell immunogenicity predictor gave an output with scores ranging from −0.02 to 1.09, shown in Table SI 3.

B-cell Epitope Prediction
The efficacy of NHP2 in inducing B-cell responses was studied. The ABCpred server, which works with artificial neural networks, was used to study the linear B cell epitopes of the NHP2 protein with a prediction accuracy of 65.93%. The ABCpred server predicted 27 epitopes with scores of 0.56-0.90, where the default threshold value was 0.51; these are listed in Table SI 4. BcePred predicted ten B-cell epitopes in the NHP2 protein with an accuracy of 58.7% at a threshold of 2.3. BcePred predicts linear B cell epitopes from a combination of physico-chemical parameters like hydrophobicity, polarity and flexibility at a threshold of 2.38. Amino acid segments above the threshold value of 2.38 are considered linear B-cell epitopes. BepiPred is a server based on a random forest algorithm trained on epitopes annotated from previously solved antigen-antibody protein structures. BepiPred predicted the B cell epitopes of the NHP2 protein with scores ranging from 0.2 to 0.722, as represented in Fig. 4.

Protein Secondary Structure Prediction
The secondary structure of a protein plays an important role in predicting its tertiary structure and folding pattern. Different computational tools including SOPMA, GOR4 and PSIPRED were used to study the secondary structure of the NHP2 protein. SOPMA predicts the secondary structure of the query protein by searching against a database of proteins having similar functions and evolutionary significance. Table SI 5 shows the results predicted using SOPMA and GOR4. SOPMA results were based on short homologous amino acid sequences, with a prediction probability of 69.5%. SOPMA predicted 31.72% alpha helical residues, 16.04% extended strands, 3.75% beta turns and 48.51% random coils. The GOR IV method is based on the information-theoretic probability of amino acids forming a secondary structure and has 64.4% accuracy. GOR4 predicted 34.7% alpha helices, 14.55% extended strands and 50.75% random coils. PSIPRED is a tool that predicts protein secondary structure and uses GenTHREADER and MEMSAT for fold identification and transmembrane topology. PSIPRED predicts the secondary structure together with the transmembrane topology, helices and domain recognition sites. The secondary structure of the query protein was predicted based on the physical and chemical characteristics of its amino acids and is represented in Fig. 5. Like the other tools used, PSIPRED predicted the NHP2 protein to be a membrane protein with a transmembrane helix. The web servers GOR4, SOPMA and PSIPRED thus gave similar secondary structures for the NHP2 protein based on the physical and chemical characteristics of its amino acids.

Protein Tertiary Structure Modeling
The MODELLER server was used to search the PDB database for the protein structure most similar to the NHP2 protein. The MODELLER server reported that the 4TMD A template, which had an E-value of 1.1e-16, a target length of 195 and aligned coils of 181, was selected with a probability of 99.7%. A PDB file containing the predicted model was created. Using the Robetta server, the 3-D protein structure was created (Fig. 6(a)). Robetta is an automated de novo structure prediction programme that uses homology modeling to predict protein structure. The servers PROCHECK, VERIFY3D and ProSA were used to analyze the resulting 3-D structure. The Ramachandran plot was used to assess the psi and phi angles of the 268 amino acids that make up the modeled protein. The 3-D structure of the NHP2 protein displayed an 84.8% favorable Ramachandran plot, with 196 amino acid residues in favorable regions and 30 residues in permitted regions. There were 231 residues that were not proline or glycine. The precision of the resulting model was increased using side chain perturbation on the GalaxyRefine server. Figure 6(b) displays the refined structure predicted by the GalaxyRefine web server. An RMSD value of 0.632, a Galaxy energy of 6514.38 and a Ramachandran favorability of 93.6% were obtained for the improved 3D structure (Fig. 6(d)). Figure 6(c) shows the top-ranking ligand binding pocket locations. Using the ProSA web server, the Z-score of the generated structure was determined. According to Fig. 6(e), the Z-score of the 3-D structure was calculated to be -7.38. Using the Verify 3D server, a 3D-1D structural confirmation score of 86.94% was obtained. The ligand binding pocket locations of the generated structure were predicted using the P2Rank server. P2Rank is a server that uses machine learning to predict the locations where ligands bind to proteins in their three-dimensional (3D) structures. Based on the ranking, the locations with high ligand binding affinity were identified.
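The Ramachandran percentage quoted above can be checked directly from the residue counts. The short calculation below assumes the usual PROCHECK convention that percentages are taken over the non-glycine, non-proline residues; the count of residues outside the favorable and permitted regions is derived by subtraction and is not a figure stated in the text.

```python
# Counts reported for the initial NHP2 model.
favorable = 196          # residues in favorable regions
permitted = 30           # residues in permitted regions
non_gly_pro_total = 231  # residues that are neither glycine nor proline

favorable_pct = 100.0 * favorable / non_gly_pro_total
permitted_pct = 100.0 * permitted / non_gly_pro_total

# Residues in the remaining regions, obtained by subtraction (derived value).
remaining = non_gly_pro_total - favorable - permitted
remaining_pct = 100.0 * remaining / non_gly_pro_total

print(f"favorable: {favorable_pct:.1f}%")
print(f"permitted: {permitted_pct:.1f}%")
print(f"remaining: {remaining} residues ({remaining_pct:.1f}%)")
```

The favorable fraction, 196 of 231, rounds to the 84.8% given in the text, which confirms that the denominator excludes glycine and proline.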

Gene Ontology Study
Gene ontology studies help in identifying the molecular function, cellular function and biological function of genes. According to the QuickGO server, the molecular function of the protein is penicillin binding, with a Cscore GO of 0.38, as shown in Fig. 7. The biological process of NHP2 was given as response to chemicals with a Cscore GO of 0.50, and the cellular component of NHP2 was predicted to be an intrinsic component of membrane with a Cscore GO of 0.50. The gene ontology studies again indicate that the protein is a membrane protein attached to the membrane by a GPI anchor. The biological process was predicted as "response to chemicals", similar to the output of PFAM. The gene ontology studies confirm NHP2 as a membrane protein that plays an important role in the chemical functioning of the bacteria.

Molecular docking.
Effective responses from molecular docking depend on interactions between a protein and a receptor molecule. Figure 8 shows the protein-protein docking of the vaccine candidate NHP2 with the TLR3 protein. Figure 9 shows a few of the polar contacts between the amino acid residues of the NHP2 and TLR3 proteins. When a 3D structure is provided as input, docking is carried out using a shape matching algorithm whose output score is based on the complementarity of geometric shapes. The PRODIGY tool and the FireDock server were used to further analyze the docked protein complexes created by ClusPro, PatchDock and TongDock. FireDock is a web server for fine-tuning interactions in molecular docking, and PRODIGY is a web server for calculating the binding energy of a docked protein-protein complex. At a temperature of 25 °C, the binding free energy determined by the PRODIGY tool was −17.8 kcal/mol, and the dissociation constant (Kd) was 9.5E-14 M. There were 19, 28, 48, 13, 34 and 44 interfacial contacts among the pairwise combinations of charged, polar and apolar atoms. There were also 13, 28 and 48 contacts between charged, polar and apolar atoms. According to the PRODIGY web server, there were 25.04% charged non-interacting surfaces between NHP2 and TLR3 and 34.76% polar non-interacting surfaces. According to FireDock, the docked complex has a global energy of 13.26 kcal/mol, a hydrogen bond energy of -4.21, an attractive van der Waals energy of -27.39 and a repulsive van der Waals energy of 66.23. The ranking scores and binding free energy of the TLR3-NHP2 protein complex were investigated using HawkDock. In HawkDock, the ligand (NHP2) rotates to find the binding energy while the receptor (TLR3) stays in place. The binding free energy of the TLR3-NHP2 complex was found to be -38.87 kcal/mol. PyMOL was used to investigate how the proteins NHP2 and TLR3 interact.
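The binding free energy and the dissociation constant reported by PRODIGY are linked by the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below applies that relation as a consistency check; the function name is illustrative, and only the values of ΔG and temperature come from the text.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def kd_from_delta_g(delta_g_kcal_per_mol: float, temp_celsius: float = 25.0) -> float:
    """Dissociation constant (in M) from binding free energy: Kd = exp(dG / RT)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temp_kelvin))


# Binding free energy reported for the NHP2-TLR3 complex at 25 degrees C.
kd = kd_from_delta_g(-17.8, temp_celsius=25.0)
print(f"Kd = {kd:.2e} M")
```

With ΔG rounded to −17.8 kcal/mol, this gives a Kd of roughly 9e-14 M, in line with the reported 9.5E-14 M; the small difference is what one would expect from rounding of ΔG.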

Molecular dynamic simulation.
Molecular dynamics simulation is a computer-based method that helps in understanding the molecular movements of biomolecules. Molecular dynamics studies help in understanding the conformational changes of, and the energy required by, the docked proteins. The molecular dynamics simulation and NMA (normal mode analysis) of the NHP2-TLR3 docked complex are represented in Fig. 10(a). NMA showed the flexibility of the docked protein-protein complex, in which the structural changes were easily visible. MD simulation shows the spatial arrangement of atoms in the NHP2-TLR3 complex. The deformability graph measures the ability of a molecule to deform at each of its residues and is shown in Fig. 10(b). The B-factor value shows the root mean square value and uncertainty of the NMA of the PDB complex, represented in Fig. 10(c). An average RMS value ranging from 0 to 0.8 is seen in the B-factor graph. The eigenvalue graph of the NHP2-TLR3 complex depicts the motion stiffness, where the eigenvalue is directly proportional to the energy required to deform the structure. The eigenvalue of the NHP2-TLR3 complex was calculated to be 8.532180e-05, as in Fig. 10(d). A lower eigenvalue represents easier deformation. In the variance graph of the normal modes of the protein complex, cumulative variance is shown in green and individual variance in red (Fig. 10(e)).
The eigenvalue is inversely related to the variance associated with each normal mode. The covariance matrix graph of the NHP2-TLR3 complex shows the coupling between the protein residues, with correlated motions of residues in red, uncorrelated motions in white and anticorrelated motions in blue, as depicted in Fig. 10(f). The elastic map of the NHP2-TLR3 complex shows the relations between the atoms as springs. Darker gray dots represent stiffer connections and lighter dots represent more flexible connections, as represented in Fig. 10(g).
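The inverse relation between eigenvalue and variance can be made concrete with a small sketch. It assumes the simplest reading of that relation, namely that each mode's share of the variance is proportional to 1/eigenvalue; the eigenvalues below are hypothetical and are not iMODS output for the NHP2-TLR3 complex.

```python
def variance_fractions(eigenvalues):
    """Fraction of total variance per normal mode, assuming variance ~ 1/eigenvalue."""
    inverse = [1.0 / value for value in eigenvalues]
    total = sum(inverse)
    return [value / total for value in inverse]


def cumulative(fractions):
    """Running total of the per-mode variance fractions."""
    running, totals = 0.0, []
    for fraction in fractions:
        running += fraction
        totals.append(running)
    return totals


# Hypothetical eigenvalues for the three lowest-frequency modes.
toy_eigenvalues = [1.0, 2.0, 4.0]
individual = variance_fractions(toy_eigenvalues)   # per-mode share
cumulative_variance = cumulative(individual)       # running share
```

The mode with the lowest eigenvalue carries the largest share of the variance, which is why the softest modes dominate the individual and cumulative variance bars.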

Discussion
Tuberculosis has been found to mostly affect people in economically backward countries.This disease does not come under the top priority drug discovery list of most of the major pharmaceutical companies.
More studies and efforts are required to eradicate the disease from those countries. Vaccine design can be a major strategy to achieve this goal, and many scientists are actively pursuing it by testing different vaccine candidates. RD7 (Region of Difference 7) is a region in the genome of M. tuberculosis that has many potential antigenic proteins. Earlier studies by our group (Sarojini and Mundayoor 2020) found a novel 4.5 kb locus in the clinical isolates of the pathogen (GenBank accession number GU994138.2). One of the potential ORFs in that locus, NHP2, 807 bp in length, encoding 268 amino acids with a molecular weight of 28.149 kDa, was used for the current study. The protein was calculated to have a negative charge with an isoelectric point of 6.82 and to be non-polar in nature. The high aliphatic index of the query protein, 89.74, implies its thermostability. The number of aliphatic amino acids (alanine, leucine, isoleucine, valine) in a protein has a direct correlation with its thermostability (Ikai 1980). According to the PFAM server, NHP2 belongs to the TetR protein family, whose members are considered to be regulators of multidrug efflux pumps; the C-terminal regions in this family allow binding of a variety of inducers, which can be antibiotics, organic solvents, etc. Multidrug efflux pumps are ancient elements displaying different functions, including resistance to heavy metals. The studies suggest that the protein is evolutionarily active and plays a major role in binding a variety of inducers such as antibiotics and organic solvents. Previous studies have shown that Rv0194 encodes a novel multidrug efflux pump of M. tuberculosis with β-lactam antibiotic resistance activity (Danilchanka 2008). Multidrug resistance efflux pumps in M. leprae were studied as potential tools that are important for its lifestyle and cellular growth (Machado et al 2018). Even low-level expression of Rv0194 conferred high resistance to multiple drugs in M. bovis BCG. Gene ontology studies indicated that NHP2 is penicillin binding. The molecular function and biological process of the protein were similar to those from the PFAM studies, showing that the protein plays an important role in chemical stimulation. The function of a protein can be studied by understanding the spatial arrangement of its atoms in 3-D space. X-ray crystallography and NMR spectroscopy are the commonly used methods for determining the three-dimensional structure of a protein. Homology protein modeling is a way of determining the 3D structure of a protein by comparison with previously determined protein structures from the database. This modeling is very effective as it is based on a comparative study of amino acid sequences with the same structure and molecular evolution. The MODELLER server performs comparative modeling based on the homology of the protein sequence. The secondary and tertiary structures of the protein were developed using different computational tools. Dynamics studies of epitope-based vaccine candidates against M. tuberculosis have been successfully carried out earlier against TLR3 receptors (Bibi et al 2021). The simulation data showed high stability for the NHP2-TLR3 complex. The iMODS simulation suggests that the NHP2-TLR3 complex is stable and can be explored further. The deformation, rigidity and stability of the docked complex showed NHP2-TLR3 binding affinity with low binding energy. The deformability graph and B-factor graph show the deformability sites of the protein. The protein studied was shown to need little energy to deform, since the eigenvalue obtained was 8.532180e-05. The dynamics simulation studies also showed the protein to be less stiff, which in turn results in easy movement of the protein. Thus the bioinformatic approach has provided evidence that the NHP2 protein is a potential TB vaccine candidate, which can be further studied by in vivo experiments in animal models to better understand its efficacy and mode of action.

Conclusion
Tuberculosis continues to be a potentially fatal infection, particularly in immunocompromised patients. BCG, the only clinically authorized vaccine, is ineffective against adult-onset tuberculosis. In silico analysis was used in this study to examine the immunological and structural characteristics of the NHP2 protein in the RD7 region of M. tuberculosis. NHP2 was found to be a strong vaccine candidate because of qualities including antigenicity, non-allergenicity, non-toxicity and non-virulence. It was also found to harbor epitopes that might stimulate T and B cells, and when docked with the TLR3 receptor, it displayed a high binding affinity. The MD simulation studies also indicated that the NHP2 protein is a suitable vaccine candidate based on its stability, stiffness and deformation index, which can be further validated using more in vivo and in vitro studies. Many such vaccine candidates have to be screened in the quest for better TB vaccines in the current global scenario.
Bond formation between NHP2-TLR3 protein complex (with amino acids and bond lengths highlighted).
al 2017), Patch Dock (https ://bioin fo3d.cs.tau.ac.il/PatchDock/php.php)(Duhovny et al 2002) and Galaxy Tong Dock (https://galaxy.seoklab.org/cgibin/submit.cgi?type=TONGDOCK_INTRO) (Park et al 2019) were used for molecular docking of the developed 3D structure of NHP2 with TLR-3.For con rmation of docking, PRODIGY Web server (Vangone and Bonvin 2015) and Hawk Dock (http://cadd.zju.edu.cn/hawkdock/) (Weng et al 2019) servers were used and the structures were re ned using Fire Dock (http://bioin fo3d.cs.tau.ac.il/FireDock/php.php)(Andrusier et al 2007; Mashiach et al 2008) server.Molecular Gene ontology-based drug designing studies against tuberculosis were studied (Passi et al 2018).It has been proven that the members of Mycobacterium species show resistance against β-lactamase antibiotics.Ampicillin / Sulbactam antibiotics tested against different species of mycobacterium have shown optimum activity against bacterial growth hinting at the possible use of β lactamase antibiotics to treat tuberculosis(Prabhakaran et al 1999).Based on the data obtained from various bioinformatics tools, NHP2 was found to be located on the outer membrane where the lipids help to attach it to the membrane.The presence of signal peptide and transmembrane helices help the protein in transportation through the membrane.Signal peptides are antigen speci c and play roles in innate and adaptive immunity, making NHP2 a suitable vaccine candidate for tuberculosis.Different computational tools predicted NHP2 to have an extracellular signal peptide with transmembrane helices.It was also found to be non-allergenic, non-toxic, antigenic and non-virulent in nature.NHP2 was predicted to show high antigenicity by inducing T and B cells.B cells play a major role in imparting immunity in the early stages of tuberculosis.B cell epitope prediction is an important characteristic in vaccine designing and in determining the antigen-antibody interaction (Jespersen et al 2019).Cytotoxic T cells play an 
important role in inducing immunity against any infection. Different mycobacterial secretory proteins like the Antigen 85 complex and ESAT-6 family proteins produce immunity against tuberculosis by inducing T and B cells (Coppola and Ottenhoff 2018). The HLA Pred server predicted the presence of ten epitopes which can bind to HLA class I and class II alleles. The MHC class I binding affinity of the query protein was studied using the NetCTL server, which predicted 4 MHC ligands at a threshold value of 0.7, showing the immunological property of the protein. Our query protein, NHP2, was predicted to contain T helper cell epitopes with a good binding threshold. MHC class I molecules play a major role in engaging NK cells and T cells for further immunological functions. The ability of NHP2 to bind MHC class I molecules was studied using the Class I pMHC server, where good binding affinity of NHP2 toward class I MHC molecules was seen. ABCPred, BcePred and BepiPred servers were used to study the B cell epitopes of the protein. The servers predicted NHP2 to contain B cell epitopes, contributing to mounting an immune response. NHP2 was found to possess both B and T cell epitopes, adding strength to its potential vaccine candidature.
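The threshold-based selection of predicted ligands mentioned above (for example, the 0.7 cutoff applied to NetCTL output) can be sketched roughly as follows. This is only an illustration: the peptide strings and scores are made-up placeholders, not predictions from this study, and the choice of an inclusive cutoff (score >= threshold) is an assumption.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    peptide: str   # candidate peptide (placeholder sequence, not from the study)
    score: float   # prediction score as reported by an epitope server


def select_epitopes(predictions: List[Prediction], threshold: float = 0.7) -> List[Prediction]:
    """Keep predictions scoring at or above the threshold, highest score first."""
    kept = [p for p in predictions if p.score >= threshold]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Hypothetical scores for illustration only.
example = [
    Prediction("PEPTIDEAA", 0.55),
    Prediction("PEPTIDEBB", 0.92),
    Prediction("PEPTIDECC", 0.70),
    Prediction("PEPTIDEDD", 0.81),
]

for hit in select_epitopes(example):
    print(hit.peptide, hit.score)
```

Raising the threshold trades sensitivity for specificity, so the number of reported ligands depends directly on the cutoff chosen.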
Table showing the physical and chemical parameters of NHP2 protein of Mycobacterium tuberculosis.
NCBI CDD-BLAST studies were conducted using NHP2 query amino acid sequences. CDD BLAST studies help in identifying conserved domains in an evolutionary context. NCBI blastx of the NHP2 nucleotide sequence showed 100% sequence similarity with different isolates of M. tuberculosis and 99% with a few isolates of M. canettii, a smooth tubercle bacillus. A search for conserved domains using CDD BLAST and SMART revealed no such domains for the query protein. PFAM studies showed NHP2 to have a tetracycline repressor-like C-terminal sequence ranging from 42-104, as shown in Fig. 1.
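To make percentage figures such as the 100% and 99% values above more concrete, the following minimal sketch computes percent identity over two sequences that are already aligned. It is a simplification and not how BLAST works internally: BLAST builds its own local alignments and reports identity and similarity (positives) separately. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned: equal length, with '-' marking gaps.
    Gap columns count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Made-up sequences for illustration only.
print(percent_identity("MKTAYIAK", "MKTAYIAK"))  # identical sequences
print(percent_identity("MKTAYIAK", "MKTAHIAK"))  # one substitution in eight columns
```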
998 were created against the TLR3 protein using the NHP2 protein as the ligand. It was shown that some NHP2 protein residues (Ser 6, Ala 26, Arg 73, Gln 127, Asp 144, Val 198, Ser 202, Asp 207, Arg 236, Gln 247, Arg 266) interact polarly with certain TLR3 receptor residues (Ala 655, His 32, Thr 59, Asp 153, Asn 230, Glu 363, Tyr 465, Arg 643, Asn 659, ...). These polar interactions were found to have bond lengths ranging from 1.7 to 3.0 Å. NHP2-TLR3 was docked asymmetrically using the Tong Dock server. Protein-protein docking yielded a score of 1555.90; a higher Tong Dock A score is regarded as indicating a higher-quality model. Patch Dock is a docking web server based on structural complementarity.
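Polar contacts of the kind listed above are commonly identified with a distance criterion between nitrogen and oxygen atoms on the two partners. The sketch below illustrates that idea under stated assumptions: the 3.0 Å cutoff simply mirrors the upper bound reported above, the residue labels and coordinates are invented, and a real hydrogen-bond analysis would also check donor/acceptor roles and angles.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    residue: str                      # placeholder residue label, e.g. "Ser 1"
    element: str                      # chemical element symbol
    xyz: Tuple[float, float, float]   # coordinates in angstroms


def polar_contacts(ligand: List[Atom], receptor: List[Atom],
                   cutoff: float = 3.0) -> List[Tuple[str, str, float]]:
    """List N/O atom pairs across the interface that lie within the cutoff."""
    polar = {"N", "O"}
    contacts = []
    for a in ligand:
        if a.element not in polar:
            continue
        for b in receptor:
            if b.element not in polar:
                continue
            d = math.dist(a.xyz, b.xyz)  # Euclidean distance (Python 3.8+)
            if d <= cutoff:
                contacts.append((a.residue, b.residue, round(d, 2)))
    return sorted(contacts, key=lambda c: c[2])


# Invented coordinates for illustration only.
ligand_atoms = [
    Atom("Ser 1", "O", (0.0, 0.0, 0.0)),
    Atom("Arg 2", "N", (10.0, 0.0, 0.0)),
    Atom("Val 3", "C", (20.0, 0.0, 0.0)),   # carbon: ignored as non-polar
]
receptor_atoms = [
    Atom("Asp 101", "O", (2.0, 0.0, 0.0)),
    Atom("His 102", "N", (10.0, 2.8, 0.0)),
    Atom("Ala 103", "O", (20.0, 1.0, 0.0)),
]

for lig_res, rec_res, dist in polar_contacts(ligand_atoms, receptor_atoms):
    print(lig_res, rec_res, dist)
```

In practice the atom lists would be read from the docked complex's coordinate file; the nested loop is adequate for an interface of this size but scales quadratically.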
The transmembrane protein TLR3 functions as a pattern recognition receptor in innate immunity and aids in the identification of microbial membrane components. Using the Clus Pro, Patch Dock, and Galaxy Tong Dock servers, NHP2 was docked against the TLR3 receptor protein. Clus Pro is a fully automated online server for protein-protein docking that evaluates a number of potential complexes and outputs those with the best surface complementarity. Utilizing Clus Pro, a lowest energy score of -1096.2 and a center energy score of -

... model was developed using the Modeller server with 99.7% similarity and an E-value of 1.1e-16 with the 4TMD_A (PDB ID) protein of M. smegmatis. The low RMSD value obtained in the study indicates a better model of the target protein. Structural and functional characterization of another protein of M. tuberculosis H37Rv, encoded by Rv0986, was earlier carried out using an in silico approach (Saikat et al 2020). The 3-D structure of the protein was developed using the Robetta server with 4TMD_A as a template, and the developed structure was refined using the Galaxy Refine web server. Structural studies of a hypothetical protein, encoded by Rv2004c of M. 
tuberculosis, Oso et al 2021). The potential of NHP2 to be a vaccine was further studied by molecular docking. Toll-like receptors heighten the production of APCs and other important innate immune cells. TLRs, as an early innate immune response, recognize mycobacteria and have been found to be involved in regulating mycobacterial RNA-induced IL-10 production by way of the PI3K/AKT signaling pathway (Bai et al 2014). The ability of NHP2 to bind with TLR receptors of immune

... using Robetta server (Priya et al 2013). The developed 3D structure of NHP2 was validated using a Ramachandran plot, with 84.8% of residues in the most favorable region, 13% in the allowed regions and 3% in the generously allowed regions. The Verify 3D server validated the structure with 86.94% confirmation. The negative Z-score of -7.38 indicates that the 3D model of NHP2 is consistent. Structural refinement of TNFL8 protein was studied using the Galaxy web server with an output of more accurate structure (