A phylogenetic tree was constructed to show the evolutionary relationships among the TolC protein amino acid sequences of rickettsial species (n = 76) (Fig. 1). In the dendrogram, sequences from the same species clustered together, and TolC sequences belonging to the spotted fever group, typhus group and scrub typhus group formed separate clusters. The branching pattern indicated that TolC sequences are homologous and closely related within each rickettsial group. The TolC sequence of O. tsutsugamushi was highly distinct from those of the other rickettsial species and was therefore excluded from epitope prediction.
A consensus amino acid sequence of the TolC protein was derived for each rickettsial species, and these sequences were used to identify immunodominant B-cell and T-cell candidate peptide epitopes. BepiPred predicted six to eight B-cell peptides for each rickettsial species. From these, epitopes were selected that had high prediction scores, were 15 to 30 residues long, had at least 50% exposed residues, and were conserved across rickettsial groups or species. Three B-cell candidate peptide epitopes met these criteria (Table 1). Of the three, IYPEGGAQYSRIRSAKNQTRNSA/VVQ had the highest VaxiJen score (0.93), indicating that it is a probable protective antigen and the most suitable vaccine candidate. This epitope was conserved in both the typhus group and the spotted fever group, with a single amino acid change (A330V).
Table 1
Selected B-cell peptide epitopes conserved across rickettsial groups or species.

| Peptide | Position | Peptide length | Exposed residues (%) | VaxiJen score | Conserved across species |
|---|---|---|---|---|---|
| LQSGRTYYNPQGDINAINNR | 277-296 | 20 | 70 | 0.5573 | R. rickettsii; R. sibirica |
| IYPEGGAQYSRIRSAKNQTRNSA/VVQ | 308-332 | 25 | 68 | 0.9298 | R. rickettsii; R. sibirica; R. conorii; R. typhi; R. japonica; R. prowazekii |
| GELTAQSLKLKVKYFSPEEEFKTIKKKM | 423-450 | 28 | 78.57 | 0.7218 | R. akari; R. australis; R. conorii; R. rickettsii |
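
The selection criteria described above (length of 15 to 30 residues, at least 50% exposed residues, then ranking by antigenicity score) can be sketched as a simple filter. This is a minimal illustration, not the pipeline used in the study: the `Candidate` class and the `select_candidates` function are hypothetical names, the values are copied from Table 1, and the BepiPred score and conservation criteria are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One B-cell epitope candidate (values copied from Table 1)."""
    peptide: str
    length: int          # number of residues, as reported in Table 1
    exposed_pct: float   # percentage of exposed residues
    vaxijen: float       # VaxiJen antigenicity score


# Rows of Table 1; the class and field names are illustrative assumptions.
TABLE_1 = [
    Candidate("LQSGRTYYNPQGDINAINNR", 20, 70.0, 0.5573),
    Candidate("IYPEGGAQYSRIRSAKNQTRNSA/VVQ", 25, 68.0, 0.9298),
    Candidate("GELTAQSLKLKVKYFSPEEEFKTIKKKM", 28, 78.57, 0.7218),
]


def select_candidates(candidates, min_len=15, max_len=30, min_exposed_pct=50.0):
    """Keep candidates that meet the length and exposure criteria,
    sorted from highest to lowest VaxiJen score."""
    kept = [
        c for c in candidates
        if min_len <= c.length <= max_len and c.exposed_pct >= min_exposed_pct
    ]
    return sorted(kept, key=lambda c: c.vaxijen, reverse=True)


ranked = select_candidates(TABLE_1)
best = ranked[0]
```

With the Table 1 values, all three peptides pass the length and exposure thresholds, and `best` is the peptide with the highest VaxiJen score.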
The 3D structure of the TolC protein predicted from the rickettsial consensus sequence was of acceptable quality, with more than 85% of amino acid residues in the most favored region (Fig. 2). The chosen epitope is highlighted in the 3D protein structure (Fig. 3). The predicted structure had a TM-score of 0.76 ± 0.10 and a C-score of 0.35. The C-score ranges from −5 to 2, and a higher value signifies a higher-confidence model.
To predict promiscuous MHC class I binding T-cell epitopes (9-mers), we used both the NetMHCpan 4.0 and IEDB Analysis Resource servers with HLA supertypes. The 9-mer peptides that were predicted by both programs, had TAP binding efficiency and survived proteasomal cleavage were selected for further analysis. These peptides were conserved across different rickettsial groups and species; the level of conservation among the consensus sequences is shown in Fig. 4. Based on these criteria, three peptides (YNKKYVNRL, SLKLKVKYF and KLYEAKITR) were selected (Table 2).
Table 2
Selected MHC class I T-cell peptide epitopes conserved across rickettsial groups or species.

| Peptide | Conserved across species | HLA alleles |
|---|---|---|
| KLYEAKITR | R. typhi; R. conorii; R. sibirica; R. akari; R. rickettsii; R. australis; R. japonica | HLA-A*02:01; HLA-A*31:01; HLA-A*03:01; HLA-A*11:01; HLA-A*33:01; HLA-A*68:01; HLA-A*02:01; HLA-A*32:01; HLA-A*30:01; HLA-A*02:06; HLA-A*02:03; HLA-B*57:01; HLA-B*15:01; HLA-B*40:01; HLA-B*58:01; HLA-B*51:01; HLA-A*23:01; HLA-B*07:02; HLA-A*24:02; HLA-A*30:02; HLA-A*26:01; HLA-B*35:01; HLA-A*01:01; HLA-B*44:02; HLA-B*44:03; HLA-B*53:01; HLA-A*68:02; HLA-B*08:01 |
| SLKLKVKYF | R. typhi; R. conorii; R. sibirica; R. akari; R. rickettsii; R. australis; R. japonica; R. prowazekii | HLA-B*0801; HLA-B*08:01; HLA-B*15:01; HLA-A*26:01; HLA-A*32:01; HLA-B*44:02; HLA-A*23:01; HLA-A*02:03; HLA-A*24:02; HLA-A*31:01; HLA-B*57:01; HLA-A*03:01; HLA-A*02:01; HLA-A*30:01; HLA-A*11:01; HLA-A*01:01; HLA-B*44:03; HLA-A*33:01; HLA-B*07:02; HLA-B*58:01; HLA-A*30:02; HLA-A*68:01; HLA-A*02:06; HLA-B*35:01; HLA-B*40:01; HLA-B*53:01; HLA-B*51:01; HLA-A*68:02 |
| YNKKYVNRL | R. typhi; R. conorii; R. sibirica; R. akari; R. rickettsii; R. australis; R. japonica; R. prowazekii | HLA-A*24:02; HLA-B*08:01; HLA-A*23:01; HLA-A*24:02 |
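
The allele lists in Table 2 contain repeated entries and two spellings of the same allele (for example HLA-B*0801 and HLA-B*08:01), so counting how many distinct alleles a peptide is predicted to bind requires normalizing the names first. The sketch below shows one way to do that; `normalize_allele` and `unique_alleles` are hypothetical helper names, and the regular expression only covers the four-digit notations that appear in this section.

```python
import re

# Matches e.g. "HLA-A*24:02", "HLA-B*0801" or "HLA B*0801".
_ALLELE = re.compile(r"^HLA[- ]?([A-Z0-9]+)\*(\d{2}):?(\d{2})$")


def normalize_allele(name):
    """Return the allele in 'HLA-A*24:02' form, or raise ValueError."""
    match = _ALLELE.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised allele name: {name!r}")
    gene, group, protein = match.groups()
    return f"HLA-{gene}*{group}:{protein}"


def unique_alleles(cell):
    """Split a semicolon-separated table cell and drop repeats,
    keeping the order of first appearance."""
    seen = []
    for raw in cell.split(";"):
        if not raw.strip():
            continue  # ignore empty fragments such as a trailing ';'
        allele = normalize_allele(raw)
        if allele not in seen:
            seen.append(allele)
    return seen


# HLA column of the YNKKYVNRL row in Table 2.
ynkkyvnrl = unique_alleles("HLA-A*24:02; HLA-B*08:01; HLA-A*23:01; HLA-A*24:02")
```

For the YNKKYVNRL row this yields three distinct alleles, because HLA-A*24:02 is listed twice.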
The three peptide epitopes were docked using the CABS-dock program. Each epitope was docked with a representative HLA allele to which both programs predicted it would bind with high affinity. The docking scores are shown in Table 3, and the binding of each peptide within the 3D model of its HLA allele is shown in Fig. 5. The peptide YNKKYVNRL had a comparatively high RMSD score in the docking experiment. However, it was predicted to bind very few HLA alleles compared with the other epitopes. Therefore, epitope KLYEAKITR, which had a good RMSD score and showed promiscuous binding to 28 different HLA allele supertypes, was considered suitable and selected for the molecular dynamics simulation.
Table 3
Docking scores of MHC-I proteins with their respective peptides.

| Peptide sequence | PDB ID | Docked protein | CABS-dock RMSD score |
|---|---|---|---|
| KLYEAKITR | 3UTQ | HLA-A*0201 | 4.73 |
| SLKLKVKYF | 4QRQ | HLA-B*0801 | 1.13 |
| YNKKYVNRL | 4F7T | HLA-A*2402 | 8.56 |
The epitope KLYEAKITR was subjected to molecular dynamics simulation to analyze the stability of the protein-peptide complex. A 50-nanosecond (ns) simulation was performed on the protein and on the protein-peptide complex, and the RMSD was calculated to assess the stability of the complex. The RMSD of the backbone atoms (Cα) of the protein-peptide complex relative to the initial structure is plotted in Fig. 6. The calculated RMSD for the complex was ~0.4 Å. The system converged from 35 ns to 50 ns, indicating that the complex was more stable after 35 ns.
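
For reference, the RMSD of a frame relative to the initial structure is the square root of the mean squared displacement of the selected atoms. The sketch below computes it in plain Python under two assumptions that are not taken from the study: each frame is a list of (x, y, z) tuples for the same atoms in the same order, and the frames have already been superimposed on the reference (no least-squares fitting is done). The function names and the toy coordinates are illustrative only.

```python
import math


def rmsd(reference, frame):
    """Root-mean-square deviation between two equally sized lists of
    (x, y, z) coordinates. Assumes the frame is already superimposed
    on the reference; no fitting is done here."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(reference, frame):
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(reference))


def rmsd_series(trajectory):
    """RMSD of every frame relative to the first frame."""
    reference = trajectory[0]
    return [rmsd(reference, frame) for frame in trajectory]


# Toy three-atom trajectory (made-up numbers, not simulation output).
toy = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 3.0)],
]
series = rmsd_series(toy)
```

Plotting such a series against simulation time gives a curve of the kind described for Fig. 6; a plateau in the curve is what is usually read as convergence.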
To check for conformational changes in the protein-peptide complex, intermolecular hydrogen bonds were analyzed. The hydrogen bond count was around 300 and remained stable during the simulation. The solvent accessible surface area (SASA) measures the surface of the protein molecule that is accessible to solvent. A few orientation changes were observed in the protein-peptide complex, with the SASA changing from ~225 nm to ~185 nm. Radius of gyration (Rg) analysis indicated only a slight change in the complex (value of ~2.31 nm) during the simulation, indicating that the protein-peptide complex remained compact (Fig. 7).
Among the MHC class II epitopes predicted by the program, epitope AFSGFMPSVGLQINR was found to be highly conserved across rickettsial species (Table 4). The epitope also had a very low adjusted rank, indicating good binding to the MHC molecule.
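
The radius of gyration used as a compactness measure is the mass-weighted root-mean-square distance of the atoms from their centre of mass. A minimal sketch of that calculation follows; the function name, the unit-mass default and the toy coordinates are assumptions made for illustration and are not taken from the simulation.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of (x, y, z) points.
    With masses=None every atom gets unit mass."""
    if not coords:
        raise ValueError("need at least one atom")
    if masses is None:
        masses = [1.0] * len(coords)
    if len(masses) != len(coords):
        raise ValueError("one mass per atom is required")
    total_mass = sum(masses)
    # Centre of mass along each axis.
    com = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total_mass
        for axis in range(3)
    ]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted = sum(
        m * sum((c[axis] - com[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted / total_mass)


# Two unit-mass atoms 2 units apart: each lies 1 unit from the centre.
rg = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
```

A value that stays roughly constant from frame to frame, as reported for the complex, indicates that the structure neither unfolds nor collapses during the run.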
Table 4
Selected MHC class II T-cell peptide epitopes.

| Rickettsial species | MHC allele | Predicted MHC-II binding epitopes | Adjusted rank |
|---|---|---|---|
| R. rickettsii | HLA-DRB1*07:01 | AFSGFMPSVGLQINR; PRAFSGFMPSVGLQI; RAFSGFMPSVGLQIN | 0.21 |
| R. typhi | HLA-DRB3*02:02 | AFSGFMPNVGLQINR; FSGFMPNVGLQINRQ | 0.05 |
| R. akari | HLA-DRB1*07:01 | VSVWEGFEAAKSRIV | 0.28 |
| R. australis | HLA-DRB1*07:01 | VSVWEGFEAAKSRIV | 0.28 |
| R. japonica | HLA-DRB3*02:02 | AFSGFMPNVGLQINR; FSGFMPNVGLQINRQ | 0.05 |
| R. prowazekii | HLA-DRB3*02:02 | AFSGFMPNVGLQINR | 0.05 |
| R. sibirica | HLA-DRB1*07:01 | AFSGFMPSVGLQINR; PRAFSGFMPSVGLQI; RAFSGFMPSVGLQIN | 0.21 |
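
Conservation of a class II epitope across the species in Table 4 can be checked by comparing peptides position by position. The sketch below counts the species whose predicted epitopes match a query peptide, optionally tolerating a small number of substitutions. The helper names and the one-substitution tolerance are assumptions made for illustration, not the criterion used in the study; the peptide lists are copied from Table 4.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x != y for x, y in zip(a, b))


def species_with_epitope(query, table, max_mismatches=1):
    """Species whose predicted epitopes include the query, allowing up to
    max_mismatches substitutions (the tolerance is an assumption)."""
    hits = []
    for species, epitopes in table.items():
        if any(
            len(e) == len(query) and hamming(query, e) <= max_mismatches
            for e in epitopes
        ):
            hits.append(species)
    return hits


# Predicted MHC-II epitopes per species, copied from Table 4.
TABLE_4 = {
    "R. rickettsii": ["AFSGFMPSVGLQINR", "PRAFSGFMPSVGLQI", "RAFSGFMPSVGLQIN"],
    "R. typhi": ["AFSGFMPNVGLQINR", "FSGFMPNVGLQINRQ"],
    "R. akari": ["VSVWEGFEAAKSRIV"],
    "R. australis": ["VSVWEGFEAAKSRIV"],
    "R. japonica": ["AFSGFMPNVGLQINR", "FSGFMPNVGLQINRQ"],
    "R. prowazekii": ["AFSGFMPNVGLQINR"],
    "R. sibirica": ["AFSGFMPSVGLQINR", "PRAFSGFMPSVGLQI", "RAFSGFMPSVGLQIN"],
}

exact = species_with_epitope("AFSGFMPSVGLQINR", TABLE_4, max_mismatches=0)
near = species_with_epitope("AFSGFMPSVGLQINR", TABLE_4, max_mismatches=1)
```

With these inputs, `exact` lists the species carrying the identical 15-mer, and `near` additionally includes the species whose epitope differs only by the S/N substitution at position 8.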