3.1. Structure prediction
CXCR4 has a remarkable capacity to recognise a variety of unrelated proteins, peptides, and small molecules. Because of receptor conformational plasticity, which involves changes to both the side chains and the backbone, ligands occupy different areas of the binding pocket while still engaging a conserved set of binding determinants. This adaptability allows the receptor to accept ligands from several classes, including allosteric inhibitors and chemokines of the CC and CXC types. The increasing diversity of ligands for chemokine receptors creates opportunities for the rational design of ligands with better inhibitory profiles and mechanisms of action [26].
The CXCR4 protein sequence (UniProt ID: P61073) has 352 amino acids. An NMR structure is available for a 38-residue, sulfotyrosine-containing peptide derived from the CXCR4 N-terminus in complex with CXCL12 (PDB ID: 2K05) [27], but a full-length experimental structure is not available. Therefore, we modelled the protein. The tertiary structure of CXCR4 was obtained from GalaxyWEB-TBM (Fig. 1), but with a low confidence level. The Ramachandran plots of these raw models showed residues in disallowed regions, and such scores are not considered suitable for docking.
Specific portions of the protein were further refined using GalaxyWEB-Refine. The optimised structure was chosen because 316 residues (98.442%) fell in the highly preferred regions (green crosses) of the Ramachandran plot (Supplementary Fig. 1). Superimposing the raw and refined structures revealed a root mean square deviation (RMSD) of 0.3. The positions of the alpha helices differed slightly between the two structures (Fig. 2).
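The superposition RMSD reported above can be illustrated with a minimal sketch. It assumes that matching atoms (for example C-alpha atoms) of the raw and refined models have already been extracted into two coordinate arrays of identical shape. The function name `kabsch_rmsd` is hypothetical, and the Kabsch algorithm is used here only as a common choice for optimal superposition; the source does not state which tool or algorithm produced the reported value.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Assumption: row i of P and row i of Q describe the same atom.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # SVD of the covariance matrix yields the optimal rotation (Kabsch).
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))
```

Two structures that differ only by a rigid rotation and translation give an RMSD of essentially zero, so any remaining value reflects genuine conformational differences such as the shifted helix positions.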
3.2. Druggable site identification
A drug-targeting site can be an active site or an allosteric site that influences protein function. CASTp analysis identified forty-six putative pockets for ligand interaction. Pocket 1 was chosen for docking because it had the largest surface area and volume (Fig. 3). Three residues, D97, D187 and E288, play an important role in CXCR4 signalling [28]. The inhibitor-binding site in the crystal structure of CXCR4 (PDB ID: 3OE8) has been reported previously [29]. InterProScan revealed that residues 6–37 contain the chemokine receptor 4 N-terminal domain (IPR022726) and residues 55–302 contain the GPCR, rhodopsin-like, 7TM domain (IPR017452). Interestingly, this domain has been implicated in programmed cell death [30].
3.3. Screening of anti-inflammatory molecules
CXCR4 is involved in the aetiology of atherosclerosis through the persistent inflammation of the arterial wall, which is characterised by a chemokine-mediated influx of leukocytes [31]. CXCR4 has also been identified as a significant factor in the development of aneurysms, atherosclerotic plaque instability, and vascular remodelling following injury. In addition, persistent inflammation and localised immune cell infiltration with CXCR4 expression substantially promote the development of oesophageal cancer [32]. Beyond its involvement in a number of inflammation-related processes, CXCR4 overexpression has been found to contribute significantly to renal injury and neurodegenerative diseases [33].
Therefore, we focused on anti-inflammatory molecules for screening. Libraries containing 23,839 anti-inflammatory molecules were selected for docking studies because of their relevance to CXCR4 function. The molecules with the top 20 Glide scores are shown in Table 1. The ADME scores for these top 20 compounds are given in Table 2, and those for the final top three compounds in Table 3. In these tables, green indicates safe, orange indicates partially safe, and red indicates risk. Of the three final compounds, 1-[(4-ETHYLPHENYL)METHYL]-4-[(3-NITROPHENYL)METHYL]PIPERAZINE showed the best Glide score (-11.5), followed by 1-CYCLOHEXYL-4-[(2-NITROPHENYL)METHYL]PIPERAZINE (-8.2) and 1-BENZYL-7-OXO-N-[2-(PYRROLIDIN-1-YL)ETHYL]AZEPANE-2-CARBOXAMIDE (-7.0). 1-[(4-ETHYLPHENYL)METHYL]-4-[(3-NITROPHENYL)METHYL]PIPERAZINE interacts with CXCR4 via two hydrogen bonds, at GLU288 and TRP94 (Fig. 4a; 2D and 3D diagrams); 1-CYCLOHEXYL-4-[(2-NITROPHENYL)METHYL]PIPERAZINE interacts with CXCR4 via the same two hydrogen bonds, at GLU288 and TRP94 (Fig. 4b); and 1-BENZYL-7-OXO-N-[2-(PYRROLIDIN-1-YL)ETHYL]AZEPANE-2-CARBOXAMIDE interacts with CXCR4 via three hydrogen bonds, at GLU288, TRP94 and ASN101 (Fig. 4c). Piperazine compounds also exhibit anti-tubercular and anti-depressant activity [34, 35].
Table 1
Entry ID | PubChem CID | Name | Glide Score
4990 | 740944 | 1-[(4-ETHYLPHENYL)METHYL]-4-[(3-NITROPHENYL)METHYL]PIPERAZINE | -11.5
1597 | 46073602 | 1-[4-(2,4-DIFLUOROPHENOXY)-2-(4-METHYLPIPERAZIN-1-YL)-5H,6H,7H,8H-PYRIDO[4,3-D]PYRIMIDIN-6-YL]-2-(4-METHOXYPHENYL)ETHAN-1-ONE | -10.9
20574 | 136665827 | 7-[(3-CHLOROPHENYL)METHYL]-2-(4-METHYLPIPERAZIN-1-YL)-3H,4H,5H,6H,7H,8H,9H-PYRIMIDO[4,5-D]AZEPIN-4-ONE | -10.6
18401 | 3240862 | 1-[5-(3-CHLOROPHENYL)-1,2-OXAZOLE-3-CARBONYL]-4-METHYLPIPERAZINE | -9.2
22857 | 50801419 | 3-[2-(1-METHYL-1H-INDOL-3-YL)-2-(PYRIDIN-3-YL)ETHYL]-1-PHENYLUREA | -8.7
23537 | 50801367 | 3-(3-BROMOPHENYL)-1-[2-(1-METHYL-1H-INDOL-3-YL)-2-(PYRIDIN-3-YL)ETHYL]UREA | -8.6
2938 | 16764992 | 2-{[(2,1,3-BENZOTHIADIAZOL-5-YL)METHYL]SULFANYL}-4H-1,3-BENZOTHIAZIN-4-ONE | -8.2
4993 | 880196 | 1-CYCLOHEXYL-4-[(2-NITROPHENYL)METHYL]PIPERAZINE | -8.2
4661 | 695088 | 1-[(3-FLUOROPHENYL)METHYL]-4-[(4-NITROPHENYL)METHYL]PIPERAZINE | -7.8
21889 | 124063622 | 1-BENZYL-7-OXO-N-[2-(PYRROLIDIN-1-YL)ETHYL]AZEPANE-2-CARBOXAMIDE | -7.0
18456 | 2162624 | 9-ETHYL-3-({4-[(3-NITROPHENYL)METHYL]PIPERAZIN-1-YL}METHYL)-9H-CARBAZOLE | -6.7
4239 | 71686729 | 1-[(2-METHYLPHENYL)METHYL]-2-{[4-(4-METHYLPIPERAZINE-1-CARBONYL)PIPERIDIN-1-YL]METHYL}-1H-INDOLE | -6.6
1702 | 53137502 | 4-CYANO-N-[4-(MORPHOLIN-4-YL)QUINAZOLIN-7-YL]BENZAMIDE | -5.5
22407 | 87051447 | 6-METHYL-N-{[4-(METHYLSULFANYL)PHENYL]METHYL}IMIDAZO[1,2-A]PYRIDINE-2-CARBOXAMIDE | -5.0
5746 | 135511588 | 2-AMINO-5-(5-BROMO-2-FLUOROPHENYL)-3H,4H,5H,6H,7H,8H-PYRIDO[2,3-D]PYRIMIDINE-4,7-DIONE | -4.4
22192 | 135511596 | 2-AMINO-5-(2-ETHOXYPHENYL)-3H,4H,5H,6H,7H,8H-PYRIDO[2,3-D]PYRIMIDINE-4,7-DIONE | -4.4
5747 | 135511592 | 2-AMINO-5-(2-METHOXYPHENYL)-3H,4H,5H,6H,7H,8H-PYRIDO[2,3-D]PYRIMIDINE-4,7-DIONE | -4.3
18449 | 2738592 | 4-CHLORO-N-{[4-(4-METHYLPIPERAZIN-1-YL)PHENYL]METHYL}BENZENE-1-SULFONAMIDE | -3.9
7055 | 46257553 | N-(4-ACETAMIDOPHENYL)-1-[(3-METHYL-2-OXO-2,3-DIHYDRO-1,3-BENZOXAZOL-6-YL)SULFONYL]PIPERIDINE-4-CARBOXAMIDE | -3.2
Table 2
ADME study of top 20 ligands (based on Glide score). Green colour indicates safe; orange colour indicates partially safe; red colour indicates risk.
S.No. | Ligands | PubChem CID | Green | Orange | Red | Total
1 | 1597 | 46073602 | 15 | 4 | 4 | 23
2 | 1702 | 53137502 | 8 | 5 |  | 13
3 | 2938 | 16764992 | 10 | 1 | 12 | 23
4 | 4239 | 71686729 | 17 | 2 | 4 | 23
5 | 4661 | 695088 | 16 | 3 | 4 | 23
6 | 4990 | 740944 | 17 | 3 | 3 | 23
7 | 4993 | 880196 | 17 | 3 | 3 | 23
8 | 5746 | 135511588 | 15 | 1 | 7 | 23
9 | 5747 | 135511592 | 14 | 3 | 6 | 23
10 | 7055 | 46257553 | 17 | 3 | 3 | 23
11 | 18401 | 3240862 | 17 | 3 | 3 | 23
12 | 18449 | 2738592 | 15 | 5 | 3 | 23
13 | 18456 | 2162624 | 12 | 5 | 6 | 23
14 | 20574 | 136665827 | 19 | 0 | 4 | 23
15 | 21889 | 124063622 | 19 | 4 | 0 | 23
16 | 22191 | 135511596 | 14 | 2 | 7 | 23
17 | 22407 | 87051447 | 19 | 1 | 3 | 23
18 | 22857 | 50801419 | 16 | 2 | 5 | 23
19 | 23357 | 50801367 | 17 | 1 | 5 | 23
Table 3
Final top 3 ligands based on ADME and docking study
S.No. | Ligands | PubChem CID | Green | Orange | Red | Total
1 | 4990 | 740944 | 17 | 3 | 3 | 23
2 | 4993 | 880196 | 17 | 3 | 3 | 23
3 | 21889 | 124063622 | 19 | 4 | 0 | 23
The interacting residue GLU288 plays a very important role in the fusion elicited by the HIV-1 envelope glycoprotein [36]. Because all three ligands interact with this residue, they may also exert anti-HIV activity. In all complexes, the ligands interacted with the active-site residue GLU288. In our study, we focused on the prevention of kidney injury. Inhibition of CXCR4 prevents kidney injury through modulation of leukocyte infiltration and of the expression of proinflammatory chemokines/cytokines, rather than through an HSC-mediated effect [37].
3.4. Simulation of native CXCR4 and best three complexes
Based on docking and molecular dynamics, the EPI-X4 peptide was previously proposed as a promising inhibitor of CXCR4; the peptide was able to bind to residue D97 of CXCR4 [38]. In the following, the three compounds are referred to by their entry IDs: 4990 for 1-[(4-ETHYLPHENYL)METHYL]-4-[(3-NITROPHENYL)METHYL]PIPERAZINE, 4993 for 1-CYCLOHEXYL-4-[(2-NITROPHENYL)METHYL]PIPERAZINE, and 21889 for 1-BENZYL-7-OXO-N-[2-(PYRROLIDIN-1-YL)ETHYL]AZEPANE-2-CARBOXAMIDE. The RMSD plot (Fig. 5a) indicates that compound 21889 is more stable than the other two compounds. The RMSD of all conformations, sampled with a step size of 10 ns, was calculated relative to the 10 ns conformation (Supplementary Fig. 1). The average RMSD values for compounds 21889, 4993 and 4990 were 0.7, 6.5 and 11.7, respectively (Supplementary Table 1). These data strongly suggest good stability of 21889 with CXCR4. The RMSF plot suggests overall stability of all complexes except CXCR4-4993. The active-site residues D97, D187 and E288 did not fluctuate in CXCR4-4990 and CXCR4-21889. The folding of CXCR4-4990 and CXCR4-4993 was well maintained over time. The solvent-accessible surface area decreased with time. The highest number of H-bonds was present in complex CXCR4_4990, followed by CXCR4_4993. The average potential energies of CXCR4 and the three complexes were as follows: CXCR4: -1.98163e+06; CXCR4_4990: -1.83261e+06; CXCR4_4993: -1.85430e+06; and CXCR4_21889: -1.847295e+06. Together, these analyses strongly suggest the promising potential of 4990, 4993 and 21889 as inhibitors of CXCR4. CXCR4 has also been targeted for COVID-19 therapies; a few inhibitors have been reported, but the safety and stability of these molecules have not been investigated [39].
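As an illustration of the two stability measures discussed above, the following minimal sketch computes the RMSD of every frame relative to a reference frame and the per-atom RMSF. It assumes the trajectory has already been fitted to a common reference and loaded into an array of shape (frames, atoms, 3); the function names are hypothetical, and the simulation package's own analysis tools may differ in details such as fitting and atom selection.

```python
import numpy as np

def rmsd_to_reference(traj, ref_index=0):
    """RMSD of each frame to one reference frame (no re-fitting is done).

    traj: array of shape (frames, atoms, 3), assumed pre-aligned.
    Returns an array of length `frames`.
    """
    traj = np.asarray(traj, dtype=float)
    diff = traj - traj[ref_index]
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=2), axis=1))

def rmsf(traj):
    """Per-atom root mean square fluctuation about the time-averaged position.

    Returns an array of length `atoms`.
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)
    return np.sqrt(np.mean(np.sum((traj - mean_pos) ** 2, axis=2), axis=0))

# Toy trajectory (illustrative only): atom 0 is fixed, atom 1 swings along x.
toy = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
])
average_rmsd = rmsd_to_reference(toy).mean()
per_atom_rmsf = rmsf(toy)
```

Averaging the RMSD series gives a single stability figure per complex, while the RMSF profile shows which atoms or residues fluctuate most.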
3.4. Principal Component Analysis (PCA) of CXCR4 and three complexes
PCA helps to identify functionally relevant motions. During simulation, the protein and the protein-ligand complexes are in constant motion, and it is difficult to distinguish local fluctuations from collective motions because both occur at the same time. In such situations, principal component analysis is useful because it can separate local, fast motions from global, collective (often slow) motions [40]. The trajectory file (md1 backbone.xtc) and the structure file (ref.pdb) were used for the PCA.
In Fig. 6, the projections on eigenvector 1 and eigenvector 2 represent the largest-amplitude collective motions of the native protein and the complexes. The global motion of the native protein was smaller than that of the complexes. For complexes CXCR4-4990 and CXCR4-4993, the projections on eigenvectors 1 and 2 are mostly positive, which indicates more motion after ligand binding. The complex CXCR4-21889 showed positive and negative projections equally on both eigenvectors; its motion is greater than that of the native protein but less than that of the other two complexes.
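The projections discussed above can be sketched as follows. This is a minimal illustration that assumes the backbone coordinates have already been fitted to the reference structure and loaded into an array of shape (frames, atoms, 3); reading the trajectory and structure files would require a trajectory library and is not shown. The function name `pca_projection` is hypothetical.

```python
import numpy as np

def pca_projection(traj, n_components=2):
    """PCA of a pre-aligned trajectory of shape (frames, atoms, 3).

    Returns (eigenvalues, eigenvectors, projections), with components
    sorted from the largest to the smallest eigenvalue.
    """
    X = np.asarray(traj, dtype=float)
    n_frames = X.shape[0]

    # Flatten each frame to a 3N vector and subtract the average structure.
    X = X.reshape(n_frames, -1)
    X = X - X.mean(axis=0)

    # Covariance matrix of the atomic displacements.
    cov = (X.T @ X) / (n_frames - 1)

    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]

    # Project every frame onto the selected eigenvectors.
    projections = X @ evecs
    return evals, evecs, projections
```

Plotting column 0 of `projections` against column 1 gives a two-dimensional projection plot of the kind described for Fig. 6. A wider spread of points along these axes corresponds to larger collective motion.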
3.5. Per-residue decomposition analysis of three complexes
The energy function in molecular mechanics is pair-wise additive and can therefore be decomposed into per-residue components. This is important for determining how much each protein residue contributes to ligand binding [41]. This analysis showed that, in the total decomposition contribution (Fig. 7), TRP94, GLU288, TYR45 and ASP97 are the important residues contributing to the binding of 4990, 4993 and 21889. Similarly, in the side-chain decomposition contribution (Fig. 7), the same residues play an important role in ligand binding.
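The idea of decomposing a pair-wise additive energy into per-residue terms can be sketched in a few lines. The atom labels and energy values below are invented placeholders chosen only to make the arithmetic easy to follow; they are not results from this study, and the function name `per_residue_contributions` is hypothetical.

```python
from collections import defaultdict

def per_residue_contributions(pair_energies, atom_to_residue):
    """Sum protein-ligand pair energies into one value per protein residue.

    pair_energies: dict mapping (protein_atom, ligand_atom) -> energy.
    atom_to_residue: dict mapping protein_atom -> residue label.
    """
    totals = defaultdict(float)
    for (protein_atom, _ligand_atom), energy in pair_energies.items():
        totals[atom_to_residue[protein_atom]] += energy
    return dict(totals)

# Placeholder data (illustrative only, NOT values from the paper).
toy_atom_to_residue = {
    "TRP94:NE1": "TRP94",
    "TRP94:CZ2": "TRP94",
    "GLU288:OE1": "GLU288",
    "GLU288:OE2": "GLU288",
}
toy_pair_energies = {
    ("TRP94:NE1", "LIG:N1"): -1.5,
    ("TRP94:CZ2", "LIG:C7"): -0.5,
    ("GLU288:OE1", "LIG:N2"): -2.0,
    ("GLU288:OE2", "LIG:N2"): 0.25,
}
toy_result = per_residue_contributions(toy_pair_energies, toy_atom_to_residue)
```

Because the function is additive, the per-residue values sum to the total interaction energy. Restricting `pair_energies` to side-chain atoms before calling the function would give a side-chain decomposition in the same way.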
3.6. MM-GBSA of three complexes
MM-GBSA calculations were performed for the three complexes to estimate the ligand binding affinities [42]. As shown in Table 4, the binding affinity of 4990 was better than that of the other two ligands.
Table 4: MM-GBSA of three complexes
Complexes | r_psp_MMGBSA_dG_Bind | r_psp_MMGBSA_dG_Bind_Coulomb | r_psp_MMGBSA_dG_Bind_Covalent | r_psp_MMGBSA_dG_Bind_Hbond | r_psp_Prime_MMGBSA_ligand_efficiency_ln
ligprep_1.maegz:4990 | -36.1190422432 | -78.0801184995 | 3.0621897673 | -0.4301900774 | -8.561295412
ligprep_1.maegz:4993 | -34.5109171155 | -78.897123741 | 2.6722354822 | -0.9667112506 | -8.4357269593
ligprep_1.maegz:21889 | -17.9352366805 | 11.5013438204 | -0.1972390855 | -0.0282814165 | -4.2511885689
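For orientation, MM-GBSA is an end-point method in which the binding free energy is usually estimated as the free energy of the complex minus the sum of the free energies of the unbound receptor and ligand. The minimal sketch below expresses that relation and ranks the three ligands by the dG_Bind values listed in Table 4, where a more negative value indicates more favourable predicted binding. The function and variable names are hypothetical, and the sketch does not reproduce the internal workflow of the software used.

```python
def binding_free_energy(g_complex, g_receptor, g_ligand):
    """End-point estimate: dG_bind = G_complex - (G_receptor + G_ligand)."""
    return g_complex - (g_receptor + g_ligand)

# dG_Bind values copied from Table 4 (column r_psp_MMGBSA_dG_Bind).
dg_bind = {
    "4990": -36.1190422432,
    "4993": -34.5109171155,
    "21889": -17.9352366805,
}

# Sort ligand IDs from the most negative (most favourable) dG_Bind upwards.
ranking = sorted(dg_bind, key=dg_bind.get)
```

With these values, `ranking` places 4990 first, then 4993, then 21889, which matches the statement that 4990 has the best predicted binding affinity of the three ligands.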