3.1. CR3022 CDR prediction
Paratome is a bioinformatic tool for the recognition of antigen-binding regions (ABRs) in antibodies. The server predicted three ABRs in the CR3022 heavy chain and three ABRs in the CR3022 light chain. In the heavy chain these are YGFITYWIG (27-35) as ABR1, WMGIIYPGDSETRY (47-60) as ABR2 and GGSGISTPMDV (98-108) as ABR3. In the light chain they are QSVLYSSINKNYLA (27-40), LLIYWASTRES (52-62) and QQYYSTPY (95-102) as ABR1, ABR2 and ABR3, respectively. Paratome results are displayed in Table 1.
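The correspondence between the reported ranges and the residue labels used in Table 1 can be checked with a short sketch. It assumes plain sequential numbering within each segment (no insertion codes), which is how Table 1 labels the residues; the names `ABRS` and `residue_map` are illustrative only.

```python
# ABR segments as reported by Paratome in the text:
# (chain, label, first residue number, last residue number, sequence).
ABRS = [
    ("H", "ABR1", 27, 35, "YGFITYWIG"),
    ("H", "ABR2", 47, 60, "WMGIIYPGDSETRY"),
    ("H", "ABR3", 98, 108, "GGSGISTPMDV"),
    ("L", "ABR1", 27, 40, "QSVLYSSINKNYLA"),
    ("L", "ABR2", 52, 62, "LLIYWASTRES"),
    ("L", "ABR3", 95, 102, "QQYYSTPY"),
]


def residue_map(abrs):
    """Map 'chain:position' to its one-letter residue code.

    Assumes sequential numbering inside each segment and raises if a
    reported range does not match the length of its sequence.
    """
    mapping = {}
    for chain, label, start, end, seq in abrs:
        if end - start + 1 != len(seq):
            raise ValueError(f"{chain} {label}: range and sequence length disagree")
        for offset, aa in enumerate(seq):
            mapping[f"{chain}:{start + offset}"] = aa
    return mapping


RESIDUES = residue_map(ABRS)
print(RESIDUES["H:33"], RESIDUES["L:56"])  # W W
```

All six ranges agree with the lengths of their sequences, and the map contains the 67 ABR positions listed in Table 1.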
3.2. Evolutionary conservation of CR3022 amino acid positions
The normalized conservation score of each amino acid position was calculated by the Consurf server. The nine-colour conservation scores were projected onto the three-dimensional structure of the Ab, and the coloured protein structure was displayed with FirstGlance in Jmol (Figure 1). The conservation scores (9 - conserved, 1 - variable) are shown in Table 1.
Figure 1: Evolutionary conservation of CR3022 amino acid positions calculated by the Consurf server. Conservation scores are projected onto the three-dimensional structure of the Ab in nine colours. The schematic structure of the coloured protein is displayed by FirstGlance in Jmol.
Table 1: Paratome, PredUs, cons-PPISP, Consurf, GHECOM, WESA, BIPSPI and Meta-PPISP server predictions
Chain: Residue | H:55 | H:56 | H:57 | H:58 | H:59 | H:60 | H:98 | H:99 | H:100 | H:101 | H:102 | H:103 | H:104 | H:105 | H:106 | H:107 | H:108
Paratome | D | S | E | T | R | Y | G | G | S | G | I | S | T | P | M | D | V
PredUs | D | S | E | T | R | Y | G | G | S | G | I | S | P | M | T | D | V
cons-PPISP | 0.073 | 0.084 | 0.084 | 0.067 | 0.065 | 0.09 | 0 | 0 | 0.069 | 0.044 | 0.044 | 0.036 | 0.04 | 0 | 0 | 0.107 | 0
Consurf | 1 | 1 | 1 | 3 | 1 | 8 | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1
GHECOM | 0.43 | 0 | 0 | 0 | 1.26 | 0.02 | 0 | 0.1 | 0.89 | 0.46 | 1.61 | 4.2 | 3.6 | 0 | 0 | 8.08 | 6.93
WESA | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0
BIPSPI | 0.4305 | 0.3976 | 0.2848 | 0.3277 | 0.2426 | 0.1917 | 0.6202 | 0.5643 | 0.4652 | 0.2215 | 0.2111 | nan | nan | nan | nan | nan | nan
Meta-PPISP | N | N | N | N | N | N | - | - | P | P | N | P | P | - | - | N | -
Chain: Residue | L:27 | L:28 | L:29 | L:30 | L:31 | L:32 | L:33 | L:34 | L:35 | L:36 | L:37 | L:38 | L:39 | L:40 | L:52 | L:53 | L:54
Paratome | Q | S | V | L | Y | S | S | I | N | K | N | Y | L | A | L | L | I
PredUs | Q | Y | V | S | L | S | S | I | N | K | N | Y | L | A | L | L | I
cons-PPISP | 0.02 | 0.077 | 0 | 0.065 | 0.063 | 0.058 | 0.048 | 0.059 | 0.072 | 0.05 | 0.051 | 0.039 | 0 | 0 | 0 | 0 | 0
Consurf | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 5 | 7
GHECOM | 0.15 | 0 | 0.29 | 1.69 | 1.07 | 0 | 0 | 0.09 | 2.41 | 1.92 | 3.42 | 1.53 | 0.02 | 0 | 4.81 | 2.32 | 0
WESA | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
BIPSPI | 0.2704 | 0.6071 | 0.4767 | 0.4071 | 0.2915 | 0.3562 | 0.2482 | nan | nan | nan | nan | nan | nan | nan | 0.2368 | 0.2901 | 0.2331
Meta-PPISP | N | N | - | P | P | P | P | P | P | P | P | P | - | - | - | - | -
Chain: Residue | H:27 | H:28 | H:29 | H:30 | H:31 | H:32 | H:33 | H:34 | H:35 | H:47 | H:48 | H:49 | H:50 | H:51 | H:52 | H:53 | H:54
Paratome | Y | G | F | I | T | Y | W | I | G | W | M | G | I | I | Y | P | G
PredUs | Y | G | F | I | T | Y | W | I | G | W | M | G | I | I | Y | P | G
cons-PPISP | 0.14 | 0.124 | 0 | 0.081 | 0.112 | 0.099 | 0.068 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.063 | 0 | 0.085
Consurf | 5 | 3 | 6 | 4 | 1 | 1 | 1 | 4 | 1 | 8 | 5 | 3 | 1 | 7 | 1 | 1 | 1
GHECOM | 0.37 | 0.62 | 2.76 | 2.99 | 0.88 | 1.32 | 0 | 0 | 0 | 3.08 | 1.69 | 0 | 0.02 | 0 | 0.82 | 2.83 | 3.07
WESA | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
BIPSPI | 0.417 | 0.5276 | 0.5765 | 0.6946 | 1 | 0.6402 | 0.6246 | 0.4596 | nan | nan | nan | nan | 0.2091 | 0.3352 | 0.5082 | 0.6796 | 0.6202
Meta-PPISP | P | P | - | P | P | P | P | - | - | - | - | - | - | - | P | - | P
Chain: Residue | L:55 | L:56 | L:57 | L:58 | L:59 | L:60 | L:61 | L:62 | L:95 | L:96 | L:97 | L:98 | L:99 | L:100 | L:101 | L:102
Paratome | Y | W | A | S | T | R | E | S | Q | Q | Y | Y | S | T | P | Y
PredUs | Y | W | A | S | T | R | E | S | Q | Q | Y | Y | S | T | P | Y
cons-PPISP | 0.053 | 0.044 | 0 | 0.034 | 0.056 | 0.062 | 0.061 | 0.086 | 0 | 0 | 0 | 0.039 | 0.047 | 0.063 | 0.098 | 0
Consurf | 3 | 1 | 1 | 1 | 1 | 5 | 1 | 7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
GHECOM | 1.94 | 3.06 | 1.25 | 1.41 | 0.15 | 0.04 | 6.6 | 3.78 | 0 | 0.06 | 2.31 | 1.44 | 2.03 | 1.86 | 2.82 | 3.51
WESA | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1
BIPSPI | 0.2802 | 0.2489 | 0.1975 | nan | nan | nan | nan | nan | 0.2459 | 0.2307 | 0.1949 | nan | nan | nan | nan | nan
Meta-PPISP | P | P | - | P | P | N | N | N | - | - | - | P | N | N | N | -
3.3. CR3022 interface prediction
Potential interfacial residues identified through PredUs are presented in Table 1. The following residues were predicted as possible interfacial residues: W33 in heavy-chain ABR1; W47, I50, R59 and Y60 in heavy-chain ABR2; I102, S103 and D107 in heavy-chain ABR3; Q27 in light-chain ABR1; L52, Y55, W56 and S58-E61 in light-chain ABR2; and Y97-S99, P101 and Y102 in light-chain ABR3.
cons-PPISP calculates a neural network score for every residue. This score estimates whether or not a residue is involved in a protein-protein interaction site. Buried and not predicted residues are assigned a score of 0. A score above 0 is taken to indicate an interaction residue and a score below 0 a non-interaction residue (Table 1).
WESA, a Weighted Ensemble Solvent Accessibility predictor, identified a number of AAs as solvent-accessible residues. The entire WESA results are shown in Table 1. WESA computes six different scores for each residue: Bayesian Statistics (BS), Multiple Linear Regression (MLR), Decision Tree (DT), Neural Network (NN), Support Vector Machine (SVM) and Weighted Ensemble (WE). The WE score predicts whether a residue is exposed (=1) or buried (=0).
BIPSPI predicts partner-specific protein-protein interfaces from sequence or structural features. Individual thresholds can be set for the antigen (Ag) and Ab predictions, which makes it possible to examine different expected precision/recall values. An estimate of the expected precision at the chosen thresholds is reported. Residues whose score has an expected precision greater than or equal to the precision threshold (0.500) are highlighted in green. The interactive visualization of predicted residues in the Ab structure is shown in Figure 2. The predicted interface scores for the Ab residues are displayed in Table 1.
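As a minimal sketch of this filtering step, the scores of heavy-chain residues 27-35 from Table 1 can be screened against the 0.5 cut-off. It is assumed here that the cut-off is applied directly to the tabulated score and that "nan" marks residues without a reported score; `above_threshold` is an illustrative helper, not part of BIPSPI.

```python
import math

# BIPSPI scores for heavy-chain residues 27-35, copied from Table 1.
# float("nan") stands for the "nan" entries (no score reported).
BIPSPI_SCORES = {
    "H:27": 0.417, "H:28": 0.5276, "H:29": 0.5765,
    "H:30": 0.6946, "H:31": 1.0, "H:32": 0.6402,
    "H:33": 0.6246, "H:34": 0.4596, "H:35": float("nan"),
}


def above_threshold(scores, threshold=0.5):
    """Return residues that have a reported score >= threshold."""
    return [
        residue
        for residue, score in scores.items()
        if not math.isnan(score) and score >= threshold
    ]


print(above_threshold(BIPSPI_SCORES))
# ['H:28', 'H:29', 'H:30', 'H:31', 'H:32', 'H:33']
```

Missing values are excluded explicitly, so they are never mistaken for low-scoring residues.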
Figure 2. BIPSPI interactive visualization of predicted residues in the antibody structure.
Residues whose score has an expected precision greater than or equal to the precision threshold (0.500) are highlighted in green. The interface residues predicted by BIPSPI that meet this threshold are listed below the picture.
3.4. CR3022 binding sites and pocket detection
The GHECOM server identified five pockets on the protein surface by utilizing mathematical morphology. It calculates a pocketness score (sum of 1/[Rpocket] /(1/[Rmin]*[vol of shell])) for each residue. A residue in a deeper and larger pocket has a greater pocketness value. The pocketness values of small-molecule binding sites and active sites were reported to be greater than the average value; in particular, the values for active sites were much greater. This implies that pocketness contributes to the prediction of binding and active sites from protein structures (Figure 3) (Table 1)[24]. Residues H:S103, H:D107, H:V108, L:L52 and L:E61 have a GHECOM score above 4.
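The list of residues with a GHECOM score above 4 can be reproduced from the Table 1 values with a short sketch. The grouping into contiguous segments and the names `GHECOM` and `deep_pocket_residues` are choices made for this illustration only.

```python
# GHECOM scores from Table 1, grouped by contiguous ABR segment:
# (chain, first residue number, scores in residue order).
GHECOM = [
    ("H", 27, [0.37, 0.62, 2.76, 2.99, 0.88, 1.32, 0, 0, 0]),
    ("H", 47, [3.08, 1.69, 0, 0.02, 0, 0.82, 2.83, 3.07,
               0.43, 0, 0, 0, 1.26, 0.02]),
    ("H", 98, [0, 0.1, 0.89, 0.46, 1.61, 4.2, 3.6, 0, 0, 8.08, 6.93]),
    ("L", 27, [0.15, 0, 0.29, 1.69, 1.07, 0, 0, 0.09,
               2.41, 1.92, 3.42, 1.53, 0.02, 0]),
    ("L", 52, [4.81, 2.32, 0, 1.94, 3.06, 1.25, 1.41, 0.15, 0.04, 6.6, 3.78]),
    ("L", 95, [0, 0.06, 2.31, 1.44, 2.03, 1.86, 2.82, 3.51]),
]


def deep_pocket_residues(segments, cutoff=4.0):
    """Return 'chain:position' labels whose score is strictly above cutoff."""
    hits = []
    for chain, start, scores in segments:
        for offset, score in enumerate(scores):
            if score > cutoff:
                hits.append(f"{chain}:{start + offset}")
    return hits


print(deep_pocket_residues(GHECOM))
# ['H:103', 'H:107', 'H:108', 'L:52', 'L:61']
```

The result matches the five residues named in the text.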
Meta-PPISP is a meta-method for protein-protein interaction site prediction. It is built on three individual web servers: PINUP, cons-PPISP and Promate. The predicted interface residues are listed below (residue ID: chain ID):
1:H, 3:H, 25:H, 26:H, 27:H, 28:H, 30:H, 31:H, 32:H, 33:H, 52:H, 53:H, 73:H, 76:H, 96:H, 97:H, 99:H, 100:H, 27C:L, 27D:L, 27E:L, 27F:L, 28:L, 29:L, 30:L, 31:L, 32:L, 49:L, 50:L, 52:L, 53:L, 92:L
Meta-PPISP calculated cons-PPISP, PINUP, Promate, and meta-PPISP scores.
Meta-PPISP then predicted whether each residue is in an interface (P = Positive; N = Negative; - = Buried and not predicted). Note: P corresponds to a meta-PPISP score > 0.34. The entire Meta-PPISP results are shown in Table 1.
Figure 3: GHECOM results showing the Jmol view of the pocket structure, coloured by pocket.
Figure 4. Illustration of the mutated sequences
3.5. Significant residues selection
We selected residues H:30I, H:31T, H:33W, H:54G, H:103S, L:56W, L:58S, L:59T, L:61E and L:98Y on the basis of the results of the various programs. All of these residues are located in one of the three CDR regions predicted by Paratome. Each selected residue was confirmed by at least four programs. The thresholds considered were a cons-PPISP score above 0.00, a BIPSPI score above 0.5 and a GHECOM score above 4.
In addition, the residues predicted by PredUs, Meta-PPISP and WESA were examined to select the significant AAs.
Residues H:30I, H:31T and H:54G were predicted by the cons-PPISP, WESA, BIPSPI and Meta-PPISP servers. Residue H:33W was predicted by the cons-PPISP, PredUs, BIPSPI and Meta-PPISP servers. Residue H:103S was predicted by the cons-PPISP, PredUs, GHECOM, WESA and Meta-PPISP servers. Residues L:56W, L:58S, L:59T and L:98Y were predicted by the cons-PPISP, PredUs, WESA and Meta-PPISP servers. Residue L:61E was predicted by the cons-PPISP, PredUs, WESA and GHECOM servers.
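The selection rule (support from at least four programs) can be expressed as a simple vote count. The sketch below is one possible reading of the rule. It assumes that WESA supports a residue when it is predicted exposed (1), that Meta-PPISP supports it when the call is P, and that PredUs support means the residue is named in Section 3.3. The `ResidueScores` class and `supporting_tools` function are hypothetical, and only a few Table 1 rows are included.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResidueScores:
    cons_ppisp: float        # cons-PPISP score
    predus: bool             # named as interfacial by PredUs in Section 3.3
    ghecom: float            # GHECOM score
    wesa: int                # 1 = exposed, 0 = buried
    bipspi: Optional[float]  # None where Table 1 shows "nan"
    meta_ppisp: str          # "P", "N" or "-"


def supporting_tools(r: ResidueScores) -> List[str]:
    """List the programs that support a residue under the stated thresholds."""
    tools = []
    if r.cons_ppisp > 0.0:
        tools.append("cons-PPISP")
    if r.predus:
        tools.append("PredUs")
    if r.ghecom > 4.0:
        tools.append("GHECOM")
    if r.wesa == 1:
        tools.append("WESA")
    if r.bipspi is not None and r.bipspi >= 0.5:
        tools.append("BIPSPI")
    if r.meta_ppisp == "P":
        tools.append("Meta-PPISP")
    return tools


# Values copied from Table 1; H:34I is included as a residue that is not selected.
TABLE1 = {
    "H:30I": ResidueScores(0.081, False, 2.99, 1, 0.6946, "P"),
    "H:103S": ResidueScores(0.036, True, 4.2, 1, None, "P"),
    "L:61E": ResidueScores(0.061, True, 6.6, 1, None, "N"),
    "H:34I": ResidueScores(0.0, False, 0.0, 0, 0.4596, "-"),
}

SELECTED = [name for name, r in TABLE1.items() if len(supporting_tools(r)) >= 4]
print(SELECTED)  # ['H:30I', 'H:103S', 'L:61E']
```

Under these assumptions, the programs returned for H:30I, H:103S and L:61E are the same ones listed for those residues in the text.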
3.6. SIFT analyses
Sorting Intolerant From Tolerant (SIFT) is a sequence homology-based program that predicts whether an amino acid substitution in a protein will have a phenotypic effect. SIFT assumes that protein evolution is correlated with protein function. Positions that are essential for function should be conserved in an alignment of the protein family, whereas unimportant positions should appear diverse in an alignment.
SIFT takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions at every position of the query sequence. SIFT is a multistep procedure that (1) searches for similar sequences, (2) selects closely related sequences that may share a similar function with the query sequence, (3) obtains the alignment of these chosen sequences, and (4) computes normalized probabilities for all possible substitutions from the alignment. Substitutions with normalized probabilities lower than 0.05 are predicted to be deleterious; those with probabilities higher than or equal to 0.05 are predicted to be tolerated (Table 2).
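The final classification step reduces to a single comparison against the 0.05 cut-off, as the following sketch shows. The function name `sift_call` and the example probabilities are hypothetical and are not taken from the SIFT output of this study.

```python
def sift_call(normalized_probability, cutoff=0.05):
    """Classify a substitution from its SIFT normalized probability.

    Values below the cutoff are called deleterious; values at or above
    the cutoff are called tolerated.
    """
    if not 0.0 <= normalized_probability <= 1.0:
        raise ValueError("normalized probability must lie in [0, 1]")
    return "deleterious" if normalized_probability < cutoff else "tolerated"


# Hypothetical normalized probabilities for three substitutions at one position.
example = {"A": 0.32, "P": 0.01, "G": 0.05}
calls = {aa: sift_call(p) for aa, p in example.items()}
print(calls)
# {'A': 'tolerated', 'P': 'deleterious', 'G': 'tolerated'}
```

A probability of exactly 0.05 falls on the tolerated side of the boundary, in line with the rule stated above.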
Table 2. SIFT results for selected residues
The intrinsic properties of the amino acids (AAs) are shown in different colours: black, nonpolar; green, uncharged polar; red, basic; blue, acidic. Capital letters indicate AAs that appear in the alignment; lower-case letters result from prediction.
Position | Seq Rep | Predict Tolerated
H:30I | 0.89 | H p l V R q N D G I A k E T S
H:31T | 0.89 | h i v p l q A K R E G N D T S
H:33W | 0.89 | C p m D E q W K N R G i T S V A H L F Y
H:54G | 0.85 | c p M W Q E i R K T D A H F V L N S Y G
H:55D | 0.84 | m h F i P L V r q T A k E G S N D
H:57E | 0.81 | c F Y m h I P V L G N R Q T D S A K E
H:102I | 0.56 | c W P D m E K q N G R T S I A V H L F Y
H:103S | 0.65 | w C m P D I q N G r K h E T V F L S A Y
L:56W | 1.00 | c p m D Q N W K i E G R T S V H A L F Y
L:58S | 1.00 | N S
L:59T | 0.99 | f c Y m h I p v L g D Q N A S R K E T
L:61E | 0.98 | l G T V S K Q H D Y P A E
L:98Y | 0.93 | c W p D m E K Q N G R I T S V A H L F Y
Figure 5. HADDOCK scores of all variants. Variants with scores below the threshold (the control score, -138.4) are predicted to have an enhanced affinity relative to the control CR3022 human Ab. The five best mutated CR3022 human Ab variants are coloured in red on the graph.
3.7. CR3022 variants sketching
Seventy-one variants, each carrying mutations in at least one of the three ABRs, were proposed. Residues H:30I, H:31T, H:33W, H:54G, H:103S, L:56W, L:58S, L:59T, L:61E and L:98Y, which were identified by the various programs, were mutated at random in the proposed variants. In addition, H:30I, H:33W, H:57E, H:55D, H:102I and L:56W were reported by Meng Yuan et al. (27) as interacting residues in the crystal structure of CR3022 in complex with the SARS-CoV-2 RBD. We therefore also mutated H:57E, H:55D and H:102I to obtain more diverse variants. The mutated sequences were aligned and are illustrated in Figure 4.
3.8. Protein-protein docking based on biochemical or biophysical data
The HADDOCK server models the ligand-receptor complex on the basis of biochemical and/or biophysical data. Table 3 presents the variants whose HADDOCK score is better (more negative) than that of the control. The van der Waals and electrostatic energy values, in addition to the buried surface area of the complexes, are shown.
Table 3. Docking of the control and the five best mutated CR3022 human antibody variants with the RBD Ag
Parameter | Control | variant45 | variant60 | variant67 | variant69 | variant71
HADDOCK score | -138.4 +/- 1.6 | -167.3 +/- 3.2 | -167.5 +/- 6.5 | -161.6 +/- 4.0 | -173.0 +/- 6.0 | -169.8 +/- 7.0
Cluster size | 71 | 40 | 47 | 48 | 32 | 39
RMSD from the overall lowest-energy structure | 1.0 +/- 0.9 | 3.2 +/- 0.3 | 1.2 +/- 1.0 | 0.9 +/- 0.5 | 1.8 +/- 1.1 | 3.3 +/- 0.0
Van der Waals energy | -54.5 +/- 8.7 | -68.9 +/- 5.6 | -64.4 +/- 0.8 | -80.1 +/- 5.3 | -63.8 +/- 10.4 | -63.3 +/- 4.1
Electrostatic energy | -341.7 +/- 53.7 | -338.4 +/- 12.1 | -466.5 +/- 19.9 | -247.3 +/- 27.7 | -431.1 +/- 39.3 | -435.3 +/- 43.7
Desolvation energy | -21.9 +/- 6.9 | -35.6 +/- 3.3 | -14.5 +/- 3.2 | -38.8 +/- 6.7 | -27.5 +/- 7.5 | -21.6 +/- 4.6
Restraints violation energy | 63.7 +/- 15.68 | 49.1 +/- 18.23 | 46.7 +/- 28.23 | 67.1 +/- 16.10 | 45.4 +/- 36.81 | 22.3 +/- 20.17
Buried Surface Area | 1809.7 +/- 119.0 | 2242.2 +/- 46.5 | 2221.8 +/- 77.1 | 2335.3 +/- 104.3 | 2136.4 +/- 99.7 | 2165.1 +/- 105.8
Z-Score | -1.7 | -1.5 | -1.2 | -2.2 | -1.7 | -1.6
The ranking of complex structures is based on HADDOCK scores. HADDOCK provides a fully flexible scoring scheme, since the weights of the different energy terms can be set separately for each stage of the docking. Scoring is performed according to the weighted sum (HADDOCK score) of the following terms: electrostatic energy, van der Waals energy, radius of gyration restraint, distance restraints, intervector projection angle restraints, direct RDC restraint, pseudo contact shift restraint, dihedral angle restraints, diffusion anisotropy, symmetry restraints, binding energy and desolvation energy. The structure with the lowest weighted sum is ranked first.
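The weighted sum can be illustrated with the cluster averages of Table 3. The weights used below (1.0 for van der Waals, 0.2 for electrostatic, 1.0 for desolvation and 0.1 for restraint violation energy) are an assumption of this sketch and are not stated in the text. They were chosen because they reproduce the reported HADDOCK scores to within rounding.

```python
# Assumed weights for the final scoring stage (not given in the text).
WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "air": 0.1}

# Mean energy terms and reported HADDOCK scores copied from Table 3.
TABLE3 = {
    "Control":   {"vdw": -54.5, "elec": -341.7, "desolv": -21.9, "air": 63.7, "reported": -138.4},
    "variant45": {"vdw": -68.9, "elec": -338.4, "desolv": -35.6, "air": 49.1, "reported": -167.3},
    "variant60": {"vdw": -64.4, "elec": -466.5, "desolv": -14.5, "air": 46.7, "reported": -167.5},
    "variant67": {"vdw": -80.1, "elec": -247.3, "desolv": -38.8, "air": 67.1, "reported": -161.6},
    "variant69": {"vdw": -63.8, "elec": -431.1, "desolv": -27.5, "air": 45.4, "reported": -173.0},
    "variant71": {"vdw": -63.3, "elec": -435.3, "desolv": -21.6, "air": 22.3, "reported": -169.8},
}


def haddock_score(terms, weights=WEIGHTS):
    """Weighted sum of the energy terms named in `weights`."""
    return sum(weight * terms[name] for name, weight in weights.items())


for name, terms in TABLE3.items():
    print(f"{name:10s} computed {haddock_score(terms):8.2f}   reported {terms['reported']:8.1f}")

# The complex with the lowest weighted sum is ranked first.
best = min(TABLE3, key=lambda name: haddock_score(TABLE3[name]))
print(best)  # variant69
```

Because the sum is computed from rounded cluster averages, small differences (about 0.1) from the reported scores are expected.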