3.1 Structural analysis
The physiochemical properties of the spike S protein, measured through protparam, showed that it contained 1162 amino acids (aa) with a molecular weight of 128046.70 kDa. The spike protein showed an antigenic nature when subjected to vaxijen v2.0.
Theoretical isoelectric point (PI) of spike protein was 7.71, indicating its positive in nature. An isoelectric point above 7 indicates the protein is charged positively. Near to 81 aa charges were found negative, whereas 84 aa found positive.
Protparam computed instability-index (II) 35.53, this categorize the protein as stable. Aliphatic-index 86.05, which devotes a thought to the proportional volume holding by aliphatic side chain and GRAVY value of the protein sequence is 0.012. Half-life of protein shown as the total time taken for its vanishing after it has been synthesized in cell, computed as 30 h for mammalian-reticulocytes, > 20 h for yeast, > 10 h for Escherichia coli. The total numbers of Carbon (C), Oxygen (O), Nitrogen (N), Hydrogen (H) and Sulfur (S) were entitled by the formula: C5737H8847N1495O1718S56
The secondary structure of IBV spike S protein was analyzed through PSIPRED and GOR IV server. The components of secondary structure prediction by GOR IV server are alpha helix (29.43%), extended strand (27.37%), beta turn (5.25%), and random coil (37.95%) (Fig.1).
DiANNA1.1 tool calculated 19 disulphides bond (S–S) positions and assign them a score and makes prediction based on trained neural system. The trans-membrane protein topology was investigated via online tool TMHMM. Residues from 1 to 1093 were found to be exposed to the surface, residue from 1094 to 1116 were found inside trans-membrane-region and residues from 1117 to 1162 were buried within the core-region of the S protein (Fig.1).
In refseq of IBV spike protein two conserved domains (Corona-S2, Corona-S2) were identified. The conserved domains were sequenced by Conserved Domain (CDD) BLAST search. The results revealed that corona-S1 (pfam01600) is the only member of the superfamily cl03276 and corona-S2 domain (pfam01601) is the only member of the superfamily cl20218. The top associated sequences in both domains were Feline infectious peritonitis virus (strain 79-1146), Avian infectious bronchitis virus (strain Beaudette), and Human coronavirus 229E while Severe acute respiratory syndrome-related coronavirus sequences were associated only with corona-S2 domain. The closest homologue obtained from BLASTP (refseq-protein) results was the Turkey coronavirus S protein with E value 0.00 followed by Murine hepatitis virus strain JHM with E value 9e-109 when comparing various coronaviruses in human and animals with IBV spike protein sequence (Table 2). Phylogenetic tree of IBV against other coronaviruses in human and animals was created based on COBALT multiple alignment see Fig. 2.
Table 2: Blastp similarity search of IBV against other coronaviruses in human and animals.
NCBI Protein ID
|
Protein Name
|
E- value
|
identity
|
YP_001941166.1
|
Turkey coronavirus
|
0.0
|
38.59%
|
YP_009194639.1
|
Camel alphacoronavirus
|
8e-126
|
34.05%
|
YP_009199242.1
|
Swine enteric coronavirus
|
4e-124
|
31.47%
|
YP_003767.1
|
Human coronavirus NL63
|
2e-121
|
34.01%
|
NP_598310.1
|
Porcine epidemic diarrhea virus
|
8e-120
|
31.36%
|
YP_009273005.1
|
Rousettus bat coronavirus
|
1e-115
|
32.86%
|
NP_058424.1
|
Transmissible gastroenteritis virus
|
2e-109
|
32.03%
|
YP_209233.1
|
Murine hepatitis virus strain JHM
|
9e-109
|
37.20%
|
YP_004070194.1
|
Feline infectious peritonitis virus
|
1e-108
|
31.70%
|
YP_003858584.1
|
Bat coronavirus BM48-31/BGR/2008
|
4e-107
|
35.69%
|
NP_828851.1
|
E2 glycoprotein precursor [Severe acute respiratory syndrome-related coronavirus]
|
5e-107
|
36.28%
|
YP_009724390.1
|
Severe acute respiratory syndrome coronavirus 2
|
1e-106
|
36.31%
|
YP_009555241.1
|
Human coronavirus OC43
|
3e-105
|
31.42%
|
YP_009047204.1
|
Middle East respiratory syndrome-related coronavirus
|
2e-104
|
34.71%
|
3.2 Multiple sequence alignment
Jalview was used to visualize the multiple sequence alignment of the retrieved sequences. Several areas in alignment were shown to have mutation see Fig. 3.
3.3 Phylogeny
Phylogenetic trees for IBV spike S proteins were constructed using MEGA7.0.26 (7170509) software using maximum likelihood parameter see Fig. 4.
3.4 B-cell Epitopes
Several epitopes were predicted in B cell prediction methods using the Bepipred Linear Epitope Prediction tool. The conservancy percentages of these epitopes are presented in table 3. After shortening of predicted epitopes, twenty one linear conserved epitopes were recognized. Of these, seven epitopes with different lengths were identified as linear, surface and antigenic epitopes between the positions 1139 – 1146 (see table 4). These epitopes were 1139KKSSYY1144, 1140KSSYYT1145, 1141SSYYTT1146, 1141SSYYT1145, 1142SYYTT1146, 1142SYYT1145, and 1143YYTT1146. Based on the length and antigenicity score, three epitopes (1139KKSSYY1144, 1140KSSYYT1145, 1141SSYYT1145) were selected as top B cell epitopes.
Discotope 2.0 server was used to predict the discontinuous epitopes from the 3D structure of S protein (PDB ID: 6CV0), 90% specificity, − 3.700 threshold and 22.000 Angstroms propensity score radius[45]. Total 30 discontinuous epitopes were recognized at different exposed surface areas (Table 5). The position of each predicted epitope on the surface of 3D structure of S protein is shown in Fig.5 using Chimera visualization tool [31].
Table 3. Conservancy assessment of B cell linear epitopes
Epitope no
|
Epitope sequence
|
Start
|
End
|
Epitope length
|
Percent of protein sequence matches at identity <= 100%
|
1
|
MTAPSSGMAW
|
83
|
92
|
10
|
89.13% (82/92)
|
2
|
GGPI
|
193
|
196
|
4
|
90.22% (83/92)
|
3
|
TGNFSD
|
235
|
240
|
6
|
97.83% (90/92)
|
4
|
GPLQGGCK
|
352
|
359
|
8
|
94.57% (87/92)
|
5
|
DSAV
|
450
|
453
|
4
|
91.30% (84/92)
|
6
|
VNPCEDV
|
488
|
494
|
7
|
96.74% (89/92)
|
7
|
RNETGSQ
|
512
|
518
|
7
|
94.57% (87/92)
|
8
|
VGQKE
|
642
|
646
|
5
|
81.52% (75/92)
|
9
|
STKPAGFNTP
|
656
|
665
|
10
|
81.52% (75/92)
|
10
|
PQNAPN
|
926
|
931
|
6
|
98.91% (91/92)
|
11
|
ANASQY
|
959
|
964
|
6
|
98.91% (91/92)
|
12
|
IVPA
|
966
|
969
|
4
|
86.96% (80/92)
|
13
|
DFDFN
|
1026
|
1030
|
5
|
84.78% (78/92)
|
14
|
SKWWNDTKHELP
|
1034
|
1045
|
12
|
94.57% (87/92)
|
15
|
GKKSSYYTT
|
1138
|
1146
|
9
|
97.83% (90/92)
|
Table 4: List of shortened B cell epitopes predicted by different B cell scale
No.
|
Peptide
|
Start
|
End
|
Length
|
Emini
|
koleskar
|
1
|
MTAP
|
83
|
86
|
4
|
0.949
|
0.966
|
2
|
GSRIQT
|
406
|
411
|
6
|
1.273
|
0.973
|
3
|
SRIQT
|
407
|
411
|
5
|
1.583
|
0.992
|
4
|
SRIQ
|
407
|
410
|
4
|
1.355
|
1.013
|
5
|
STKP
|
656
|
659
|
4
|
2.543
|
0.979
|
6
|
VGLP
|
704
|
707
|
4
|
0.398
|
1.143
|
7
|
VGLPT
|
704
|
708
|
5
|
0.465
|
1.096
|
8
|
NASQY
|
960
|
964
|
5
|
2.034
|
1.006
|
9
|
SKWW
|
1034
|
1037
|
4
|
1.26
|
0.932
|
10
|
KKSSYYTT
|
1139
|
1146
|
8
|
6.723
|
1.003
|
11
|
KSSYYTT
|
1140
|
1146
|
7
|
4.166
|
1.013
|
12
|
SSYYTT*
|
1141
|
1146
|
6
|
2.568
|
1.027
|
13
|
SYYTT*
|
1142
|
1146
|
5
|
2.359
|
1.03
|
14
|
YYTT*
|
1143
|
1146
|
4
|
1.26
|
1.035
|
15
|
KKSSYYT
|
1139
|
1145
|
7
|
5.773
|
1.016
|
16
|
KKSSYY*
|
1139
|
1144
|
6
|
4.931
|
1.034
|
17
|
KSSYYT*
|
1140
|
1145
|
6
|
3.559
|
1.031
|
18
|
KKSSY
|
1139
|
1143
|
5
|
3.875
|
1.009
|
19
|
KKSS
|
1139
|
1142
|
4
|
3.054
|
0.971
|
20
|
SSYYT*
|
1141
|
1145
|
5
|
2.191
|
1.051
|
21
|
SYYT*
|
1142
|
1145
|
4
|
2.019
|
1.061
|
*Shortened peptide that has high score in both Emini and kolaskar
Table 5. Discontinuous epitopes predicted through DISCOTOPE 2.0 Server
Residue ID
|
Residue Name
|
Contact Number
|
Propensity Score
|
Discotope Score
|
262
|
SER
|
2
|
-3.91
|
-3.69
|
263
|
VAL
|
4
|
-2.626
|
-2.784
|
264
|
ASN
|
0
|
-0.238
|
-0.211
|
265
|
THR
|
19
|
-1.418
|
-3.44
|
266
|
THR
|
5
|
0.483
|
-0.148
|
267
|
PHE
|
25
|
-0.627
|
-3.43
|
268
|
THR
|
7
|
-0.463
|
-1.215
|
387
|
GLY
|
1
|
-3.678
|
-3.37
|
414
|
GLU
|
7
|
-0.476
|
-1.226
|
415
|
PRO
|
8
|
0.187
|
-0.754
|
417
|
VAL
|
5
|
-0.324
|
-0.862
|
419
|
THR
|
6
|
1.351
|
0.506
|
420
|
ARG
|
0
|
1.529
|
1.353
|
421
|
HIS
|
11
|
0.482
|
-0.838
|
422
|
ASN
|
12
|
-2.504
|
-3.596
|
515
|
THR
|
4
|
-3.071
|
-3.178
|
531
|
GLY
|
5
|
-1.472
|
-1.877
|
532
|
THR
|
5
|
1.933
|
1.136
|
533
|
ARG
|
0
|
1.404
|
1.243
|
534
|
ARG
|
0
|
-0.425
|
-0.376
|
648
|
MET
|
5
|
-1.103
|
-1.551
|
649
|
GLU
|
16
|
-1.752
|
-3.39
|
650
|
LEU
|
26
|
-4.013
|
-6.541
|
651
|
LEU
|
10
|
-2.379
|
-3.256
|
652
|
ASN
|
12
|
-1.72
|
-2.902
|
655
|
SER
|
7
|
-2.994
|
-3.454
|
685
|
SER
|
0
|
-3.842
|
-3.4
|
741
|
ILE
|
15
|
-1.303
|
-2.878
|
893
|
GLN
|
7
|
-2.411
|
-2.939
|
896
|
GLU
|
9
|
-2.901
|
-3.602
|
3.5 Prediction of MHC class I epitopes
In this study, the Human MHC class-I HLA alleles were used to explore the interaction of epitopes with MHCI alleles as chicken MHC alleles don't exists in IEDB database. MHC-1 binding prediction tool using IEDB database expected thirteen conserved epitopes of Spike protein (S) which were interacted with many cytotoxic T cell alleles. These epitopes were 1115FFMTGCCGC1123, 590FNLTVTDEY598, 734GLLVLPPII742, 1105IIFILILGW1113, 1139KKSSYYTTF1147, 1087KTYIKWPWY1095, 166SVYLNGDLV174, 985TARDMYMPR993, 1145TTFDNDVVT1153, 983YITARDMYM991, 1144YTTFDNDVV1152, 982YYITARDMY990, 1143YYTTFDNDV1151.
3.6 Prediction of MHC class II epitopes
MHC-II binding prediction tool based on NN-align with half-maximal inhibitory concentration (IC50) ≤ 1000 was used. Thirty conserved core sequences were predicted to interact with MHCII alleles. These cores were 694EDLLFTSVE702, 1147FDNDVVTEQ1155, 1115FFMTGCCGC1123, 1116FMTGCCGCC1124, 590FNLTVTDEY598, 734GLLVLPPII742, 1105IIFILILGW1113, 902INECVKSQS910, 984ITARDMYMP992, 901KINECVKSQ909, 1139KKSSYYTTF1147, 1140KSSYYTTFD1148, 1187KTYIKWPWY1195, 735LLVLPPIIT743, 592LTVTDEYIQ600, 1014NKTVITTFV1022, 893QQRELATQK901, 894QRELATQKI902, 895RELATQKIN903, 589SFNLTVTDE597, 1141SSYYTTFDN1149, 166SVYLNGDLV174, 1142SYYTTFDND1150, 985TARDMYMPR993, 1146TFDNDVVTE1154, 593TVTDEYIQT601, 1013VNKTVITTF1021, 983YITARDMYM991, 1144YTTFDNDVV1152, 982YYITARDMY990 and 1143YYTTFDNDV1151.
3.7 Antigenicity, allergenicity and toxicity of MHCI and MHCII epitopes:
The predicted epitopes of MHCI and MHCII were subjected to VaxiJen v2.0 server, AllerJen v2.0. and ToxiPred to estimate the potential antigenicity, allergenicity and toxicity of epitopes. Five MHCI epitopes were identified as antigenic, non-allergic and non-toxic, but only three epitopes (985TARDMYMPR993, 983YITARDMYM991 and 982YYITARDMY990) showed a high linkage with MHCI alleles (Table 6). Furthermore, six MHCII epitopes were predicted to be antigenic, non-allergic and non-toxic epitopes (Table 7). However, 983YITARDMYM991 and 982YYITARDMY990 epitopes which were also presented in MHCII prediction methods, showed high antigenicity, no allergenicity and no toxicity. These epitopes were interacted with 52 and 38 alleles in MHCII see Fig. 6.
Table 6: Antigenic, non-allergic and non-toxic MHCI epitopes
Peptide
|
Start
|
End
|
Antigenicity
|
Allele
|
ic50
|
YYITARDMY*
|
982
|
990
|
0.8845
|
HLA-A*29:02
|
14.52
|
HLA-A*30:02
|
160.94
|
HLA-C*14:02
|
27.32
|
YITARDMYM*
|
983
|
991
|
0.7901
|
HLA-A*02:01
|
233.08
|
HLA-A*02:06
|
212.86
|
HLA-C*03:03
|
29
|
HLA-C*06:02
|
200.39
|
HLA-C*07:01
|
267.22
|
HLA-C*14:02
|
49.52
|
HLA-C*15:02
|
77.63
|
TARDMYMPR*
|
985
|
993
|
0.6914
|
HLA-A*30:01
|
56.23
|
HLA-A*31:01
|
14.3
|
HLA-A*68:01
|
28.24
|
IIFILILGW
|
1105
|
1113
|
0.6749
|
HLA-B*57:01
|
78.45
|
HLA-B*58:01
|
64.27
|
KKSSYYTTF
|
1139
|
1147
|
1.1865
|
HLA-A*32:01
|
182.52
|
Table 7: Antigenic, non-allergic and non-toxic MHCII epitopes
Core Sequence
|
Antigenicity
|
Peptide Sequence
|
Start
|
End
|
Allele
|
IC50
|
IIFILILGW
|
0.6914
|
IAFATIIFILILGWV
|
1100
|
1114
|
HLA-DRB1*15:01
|
454.6
|
KKSSYYTTF
|
0.6749
|
MSKCGKKSSYYTTFD
|
1134
|
1148
|
HLA-DPA1*01:03/DPB1*02:01
|
872.7
|
SKCGKKSSYYTTFD
|
1135
|
1149
|
HLA-DPA1*01/DPB1*04:01
|
408.1
|
HLA-DPA1*01:03/DPB1*02:01
|
301.5
|
HLA-DPA1*02:01/DPB1*05:01
|
953.4
|
KCGKKSSYYTTFDND
|
1136
|
1150
|
HLA-DPA1*01/DPB1*04:01
|
276.8
|
HLA-DPA1*02:01/DPB1*05:01
|
853.9
|
CGKKSSYYTTFDNDV
|
1137
|
1151
|
HLA-DPA1*02:01/DPB1*05:01
|
958.9
|
MSKCGKKSSYYTTFD
|
1134
|
1148
|
HLA-DPA1*01:03/DPB1*02:01
|
872.7
|
KSSYYTTFD
|
0.6466
|
MSKCGKKSSYYTTFD
|
1134
|
1148
|
HLA-DRB1*04:05
|
155
|
SKCGKKSSYYTTFDN
|
1135
|
1149
|
HLA-DRB1*04:05
|
125.6
|
KCGKKSSYYTTFDND
|
1136
|
1150
|
HLA-DRB1*04:05
|
92.2
|
CGKKSSYYTTFDNDV
|
1137
|
1151
|
HLA-DRB1*04:05
|
51.9
|
GKKSSYYTTFDNDVV
|
1138
|
1152
|
HLA-DRB1*04:05
|
46.9
|
KKSSYYTTFDNDVVT
|
1139
|
1153
|
HLA-DRB1*04:05
|
45.3
|
TARDMYMPR
|
0.7901
|
SYYITARDMYMPRAI
|
981
|
995
|
HLA-DRB1*03:01
|
269.3
|
YYITARDMYMPRAIT
|
982
|
996
|
HLA-DRB1*03:01
|
281.9
|
YITARDMYMPRAITA
|
983
|
997
|
HLA-DRB1*03:01
|
618.8
|
YITARDMYM
|
1.1865
|
QVNGSYYITARDMYM
|
977
|
991
|
HLA-DRB1*01:01
|
22
|
HLA-DRB1*04:01
|
145
|
HLA-DRB1*04:04
|
331.2
|
HLA-DRB1*07:01
|
20.3
|
HLA-DRB3*01:01
|
550.7
|
HLA-DRB5*01:01
|
227.8
|
VNGSYYITARDMYMP
|
978
|
992
|
HLA-DQA1*01:02/DQB1*06:02
|
338.6
|
HLA-DRB1*01:01
|
25.8
|
HLA-DRB1*03:01
|
447.6
|
HLA-DRB1*04:01
|
105.8
|
HLA-DRB1*04:04
|
248.3
|
HLA-DRB1*07:01
|
27.8
|
HLA-DRB1*15:01
|
380.6
|
HLA-DRB3*01:01
|
577.8
|
HLA-DRB5*01:01
|
198.6
|
NGSYYITARDMYMPR
|
979
|
993
|
HLA-DQA1*01:02/DQB1*06:02
|
393.3
|
HLA-DQA1*05:01/DQB1*03:01
|
817.3
|
HLA-DRB1*01:01
|
19.8
|
HLA-DRB1*03:01
|
176.5
|
HLA-DRB1*04:01
|
65.2
|
HLA-DRB1*04:04
|
225
|
HLA-DRB1*07:01
|
40.2
|
HLA-DRB1*15:01
|
291.2
|
HLA-DRB3*01:01
|
635
|
HLA-DRB5*01:01
|
93.5
|
GSYYITARDMYMPRA
|
980
|
994
|
HLA-DQA1*01:02/DQB1*06:02
|
218
|
HLA-DRB1*01:01
|
14
|
HLA-DRB1*03:01
|
197.3
|
HLA-DRB1*04:01
|
47.8
|
HLA-DRB1*04:04
|
242.4
|
HLA-DRB1*07:01
|
57.3
|
HLA-DRB1*15:01
|
288.6
|
HLA-DRB3*01:01
|
780.4
|
HLA-DRB5*01:01
|
61.4
|
SYYITARDMYMPRAI
|
981
|
995
|
HLA-DRB1*01:01
|
23.1
|
HLA-DRB1*04:01
|
65.3
|
HLA-DRB1*04:04
|
249.2
|
HLA-DRB1*04:05
|
356.4
|
HLA-DRB1*07:01
|
72.2
|
HLA-DRB1*15:01
|
284.7
|
HLA-DRB5*01:01
|
87.4
|
YYITARDMYMPRAIT
|
982
|
996
|
HLA-DRB1*01:01
|
40.8
|
HLA-DRB1*04:01
|
108.8
|
HLA-DRB1*04:04
|
269.1
|
HLA-DRB1*04:05
|
706.3
|
HLA-DRB1*07:01
|
160.6
|
HLA-DRB5*01:01
|
121.3
|
YITARDMYMPRAITA
|
983
|
997
|
HLA-DRB1*04:01
|
145.4
|
HLA-DRB1*04:04
|
652.4
|
HLA-DRB1*07:01
|
355.4
|
HLA-DRB1*08:02
|
955
|
HLA-DRB5*01:01
|
206.9
|
YYITARDMY
|
0.8845
|
IQVNGSYYITARDMY
|
976
|
990
|
HLA-DQA1*05:01/DQB1*02:01
|
491.6
|
HLA-DRB1*04:01
|
723.4
|
HLA-DRB1*04:04
|
819.7
|
HLA-DRB1*11:01
|
72
|
HLA-DRB1*11:01
|
72
|
QVNGSYYITARDMYM
|
977
|
991
|
HLA-DPA1*01/DPB1*04:01
|
710.8
|
HLA-DPA1*01:03/DPB1*02:01
|
875.8
|
HLA-DQA1*05:01/DQB1*02:01
|
292.7
|
HLA-DRB1*03:01
|
588
|
HLA-DRB1*11:01
|
32.4
|
HLA-DRB1*11:01
|
32.4
|
HLA-DPA1*01/DPB1*04:01
|
557.6
|
HLA-DPA1*01:03/DPB1*02:01
|
860.8
|
HLA-DQA1*05:01/DQB1*02:01
|
311.9
|
HLA-DRB1*11:01
|
17.9
|
HLA-DRB1*11:01
|
17.9
|
NGSYYITARDMYMPR
|
979
|
993
|
HLA-DPA1*01/DPB1*04:01
|
503
|
HLA-DPA1*01:03/DPB1*02:01
|
763.2
|
HLA-DQA1*05:01/DQB1*02:01
|
387.6
|
HLA-DRB1*09:01
|
858.7
|
HLA-DRB1*09:01
|
858.7
|
HLA-DRB1*11:01
|
11
|
HLA-DRB1*11:01
|
11
|
GSYYITARDMYMPRA
|
980
|
994
|
HLA-DPA1*01/DPB1*04:01
|
504.5
|
HLA-DPA1*01:03/DPB1*02:01
|
790.4
|
HLA-DQA1*05:01/DQB1*02:01
|
482.9
|
HLA-DRB1*11:01
|
15.2
|
HLA-DRB1*11:01
|
15.2
|
SYYITARDMYMPRAI
|
981
|
995
|
HLA-DPA1*01/DPB1*04:01
|
480
|
HLA-DPA1*01:03/DPB1*02:01
|
733.8
|
HLA-DQA1*05:01/DQB1*02:01
|
526.3
|
HLA-DRB1*11:01
|
26.8
|
HLA-DRB1*11:01
|
26.8
|
YYITARDMYMPRAIT
|
982
|
996
|
HLA-DPA1*01/DPB1*04:01
|
705.6
|
HLA-DPA1*01:03/DPB1*02:01
|
931.5
|
HLA-DQA1*05:01/DQB1*02:01
|
678.7
|
HLA-DRB1*11:01
|
51.8
|
HLA-DRB1*11:01
|
51.8
|
3.8 Molecular Docking
The molecular docking was achieved by docking MHCI epitopes with chicken BF alleles (BF2 * 2101 & BF2 * 0401) using peptide-binding groove affinity. The chicken alleles were used as receptors, and the top MHCI epitopes 982YYITARDMY990, 983YITARDMYM991 and 985TARDMYMPR993 were used as ligands. Docking of 983YITARDMYM991 epitope with BF2*2101 and BF2*0401 alleles showed – 72.11 and – 37.39 global energy respectively, indicating a strong binding affinity between the ligands and the receptors compared to other epitopes (Fig. 7, 8 and 9). In general, the global binding affinity of ligands with the receptor BF2*2101 alleles was found to be lower compared to BF2*0401, suggesting strong receptor-ligand interaction.