Computational investigation of adenosine 5′-(α,β-methylene)-diphosphate (AMPCP) derivatives as ecto-5′-nucleotidase (CD73) inhibitors by using 3D-QSAR, molecular docking, and molecular dynamics simulations

CD73, as a surface enzyme anchored on the outside of the cell membrane via glycosylphosphatidylinositol (GPI), can convert the AMP in the tumor cell microenvironment into adenosine to promote the growth of tumor cells. It has been overexpressed in many different types of human tumors, such as gastric cancer, pancreatic cancer, liver cancer, and other tumor cells. Therefore, targeted inhibitors of CD73 are considered potential tumor treatment methods. Due to the low bioavailability of nucleoside CD73 inhibitors, it is necessary to develop new inhibitors. In this study, through molecular docking, three-dimensional quantitative structure–activity relationship (3D-QSAR) and molecular dynamics (MD) simulations, a series of CD73 inhibitors were calculated and studied to reveal their structure–activity relationships. Through molecular docking studies, the possible mode of interaction between inhibitors and protein is explored. Subsequently, a 3D-QSAR model was established by comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA). For the best CoMFA model, the Q2 and R2 values are 0.708 and 0.983, respectively, while for the best CoMSIA model, the Q2 and R2 values are 0.809 and 0.992, respectively. Based on the contour maps, we designed ten new CD73 inhibitors and predicted their activity by the model, all of them are better than molecules in the dataset. In addition, in order to select potential drug candidates, ADMET prediction was performed on template molecules and designed compounds. Moreover, the stability of the complex formed by the two inhibitors and CD73 was evaluated by molecular dynamics simulation, and the results are consistent with the results of molecular docking and 3D-QSAR research. Finally, the binding free energy was calculated by the surface area method (MM-GBSA), and the results are consistent with the activities that van der Waals and Coulomb contribute the most during the binding process of the molecule to the CD73 protein. In conclusion, our research provides valuable information for the further development of CD73 inhibitors.


Introduction
As a new method, immune checkpoint inhibitors (ICIs) have caused a revolutionary cancer treatment [1]. However, the drug resistance of tumor cells to ICIs is a problem that cannot be ignored. Therefore, developing a new immune checkpoint inhibitor is very important for immunotherapy [2]. CD39 is a cell surface enzyme, which is highly expressed in tumor cells, exhausted T cells and multi-inhibitory cells [3,4]. It is a key enzyme for the hydrolysis of ATP and ADP into AMP [5]. CD73 is a surface enzyme anchored outside the cell membrane by glycosylphosphatidylinositol (GPI); it can convert AMP into adenosine and expressed in cancer cells, DC cells, Treg (regulatory T cells), NK cells [6,7]. Studies have found that CD73 is overexpressed in gastric cancer, pancreatic cancer, rectal cancer, and liver cancer cells [8][9][10][11]. The sequential expression of CD39/CD73 pathway can eliminate extracellular ATP in the tumor microenvironment (TME) to generate immunosuppressive adenosine and promote tumor growth [12]. In addition, CD73 also acts as a signal and adhesion molecule, which can regulate the interaction between cells and extracellular matrix components (such as laminin and fibronectin) to mediate the invasion and metastasis characteristics of cancer [13]. Therefore, targeted therapy that inhibits CD73 is expected to be a new immunotherapy method.
Structurally, CD73 presents an N-terminal domain and a C-terminal domain connected by a long α-helix. The rotation and bending of these two domains provide the open and closed transition of the conformation to bind to the substrate and released [14]. In recent years, based on the structure of adenosine, a series of competitive nucleotide derivative inhibitors have been designed, the most effective of them is AOPCP, which is also one of ideas to design more effective inhibitors [15]. Although these compounds show potential inhibition to CD73, they are negatively charged under physiological pH because they contain phosphorus atoms so that their bioavailability is very low [16]. In fact, stability studies in rat liver microsomes showed that AOPCP was completely degraded within 10 min, while the compound based on it, PSB-12489, degraded 5% within 8 h [17,18]. After that, the AB680 was designed by changing the purine ring to pyrazolopyridine ring, which has a good tolerance and has entered the phase I of clinical trials [19]. However, recent studies have found that AB680 competes with AMP in vivo, which is weaker than AMP, indicating that binding to CD73 is reversible and slow, so that AB680 has a direct implication for dosage [20]. Therefore, it is necessary to develop new CD73 inhibitors to change their bioavailability.
This study used computer-aided drug design methods to study a series of compounds with adenosine 5′-(α,βmethylene)-diphosphate derivatives, and these compounds have been tested for CD73 inhibitory activity [21]. In order to study the relationship between structure and activity, based on these 60 CD73 inhibitors, a three-dimensional quantitative structure-activity relationship (3D-QSAR) model with predictive ability was constructed. Through the analysis of the model, the structure characteristics that affect its activity were found. Secondly, through molecular docking studies, the bioactive conformation of the inhibitor was calculated and the receptor-ligand interaction between the compound and the protein was analyzed. Moreover, ADMET properties were performed on template molecules and new designed compounds to select potential drug candidates. In addition, through molecular dynamics (MD) simulations and binding free energy calculations, more realistic conformational changes were analyzed, and the understanding of the binding energy and mode of interaction of inhibitors and CD73 proteins was deeper. In summary, this study provides a basis for clarifying the structure-activity relationship of such CD73 inhibitors and the mode of interaction with proteins, and for designing more active CD73 inhibitors.

Dataset preparation
In this study, a total of 60 adenosine 5′-(α,β-methylene)diphosphate derivatives which have the same skeleton were selected from the literature; the chemical structures of the compounds with their respective activity values are listed in Table 2 [21]. The IC 50 active values distributed in a gradient of 0.027-18,000 nM were converted into the corresponding pIC 50 (−logIC 50 ) values to develop the 3D-QSAR models. The 3D structures in the dataset were constructed and energy optimization was carried out using SYBYL-X 2.1.1 by tripos force field [22] with the gradient convergence criterion and max interaction of 0.005 kcal/mol and 10,000 respectively and Gasteiger-Hükel charges [23]. The compounds in dataset were divided randomly into a training set of 42 compounds (70%) and a test set of 18 compounds (30%) for the generation of the 3D-QSAR model [24].

Molecular alignment
Molecular alignment plays a key role in 3D-QSAR study [25]. Herein, two alignment methods including ligand-based alignment and docking-based alignment methods were employed to construct the optimal 3D-QSAR model [26,27]. For docking-based alignment, the conformation of each molecule in the dataset obtained from molecular docking was selected with the best docking score for alignment. For ligand-based alignment, generally, low energy conformation of the highest bioactive molecule was set as template. Thus, compound 45 as a template molecule for the rest of molecules to align on, the public skeleton was marked in blue bold part (Fig. 1a). In the process of alignment, the eXtended Electron Distribution (XED) force field was used to calculate the field points while other parameters were set by default [28]. A very accurate and slow calculation method was selected in the process of conformational search to improve the alignment effect.

Construction of CoMFA and CoMSIA models
CoMFA and CoMSIA models of CD73 inhibitors were founded by SYBYL 6.9.1 software to describe the relationship between the biological activity and 3D structure of CD73 inhibitor compounds [29]. The fields of CoMFA model were calculated including steric (S) and electrostatic (E) [30]; meanwhile, the CoMSIA model with the above two fields also calculated its hydrophobic (H), hydrogen bond donor (HBD), and hydrogen bond acceptor (HBA) [31]. Other parameters were set in default.

Partial least squares (PLS) analysis
Evaluation of the relationship between structure and biological activity of compounds by PLS method can quantitate the reliability of the model [32]. The analysis was divided into two parts; the training set database was crossvalidated by using the leave-one-out method to calculate the cross-validation correlation coefficient (Q 2 ) and the optimum number of components (ONC) [33]. Then, the non-crossvalidation correlation coefficient (R 2 ), Fisher's test value (F value), and the standard error of estimate (SEE) were calculated through non-cross-validation analysis [34]. Among them, high score (Q 2 > 0.5, R 2 > 0.9, F > 100, SEE < 0.95) models were considered to be predictive [35,36].
What's more, whether the models were accurate, reliable, and capable of prediction, the following formula was used to calculate the R 2 pred value to evaluate CoMFA and CoMSIA.
In the above formula, SD represents the sum of the square deviation between the biological activity of the test set and the average activity of the training set, while PRESS represents the sum of the square deviation between the experimental value and the predicted value of the training set [37]. After CoMFA and CoMFA models were developed, contour maps of the field type "StDev*Coeff" represented graphically the results [38].

Molecular docking study
Molecular docking is a theoretical method to effectively verify and simulate the binding effect of drugs and target proteins through the geometric relationship between small molecular ligands and proteins, energy matching, and recognition [39]. In this study, Glide module of Schrödinger suite software was applied to molecular docking and the X-ray crystal structure (PDB code: 6Z9B) of CD73 was obtained from the Protein Data Bank (http:// www. rcsb. org/) with resolution of 2.17 Å. Then, the docking target protein must be preprocessed; the crystal structure was prepared by Protein Preparation Wizard in Maestro v9.7 to assign bond orders, add hydrogen atoms, and build the missing side chain atoms [40]. In order to get a more convincing molecular docking score, Glide was used to generate a grid based on the co-crystal ligand and the grid size was 10 × 10 × 10 Å. Then, the extra precision (XP) algorithm in Glide was used to complete the molecular docking, and other parameters as default values to further analyze the docking results were used [41].

ADMET property prediction
ADMET (drug absorption, distribution, metabolism, excretion, and toxicity) pharmacokinetic method is a very important method in contemporary drug design and drug screening. The ADMET of a compound ultimately depends on its physical, chemical, and biological properties, which can determine whether the compound can be a potential drug candidate [42]. The most relevant indicators of these physical and chemical properties are lipophilicity, hydrogen bonding, solubility, permeability, etc. [43], so the above parameters will be evaluated. We predicted the ADME properties of the template compounds 45 and 55 and the novel designed compounds by using the QikProp software (Schrodinger, Portland, OR, USA, 2017, http:// wwwsc hrodi nger. com/ QikPr op/). Furthermore, the toxicity risk of the compounds was assessed using the OSIRIS DataWarrior software v.5.5.0 (http:// www. openm olecu les. org/ dataw arrior/).

Molecular dynamics simulation
Molecular dynamics simulation can track the conformational movement of target proteins and other molecules on the atomic level [44]. Therefore, the Desmond (v3.8) module developed by Schrӧdinger software in 2018 was used for molecular dynamics simulation research to further study the specific interaction between molecules and proteins in the binding process [45]. The two complexes formed by the highly active compound 55 and the lower active compound 5 with CD73 were selected for molecular dynamics simulation.
In the process of molecular dynamics simulation, an orthogonal box filled with simple point charge (SPC) water molecules was first created to embed into these complexes [46]. Afterward, in order to reach the physiological salt concentration, a necessary amount of counter ion must be added to neutralize the system. Then, the energy was minimized under the action of the OPLS-2005 force field, the complex system was relaxed to local energy minimization, the maximum number of iterations was set to 5000, and the convergence threshold was set to 1.0 kcal/mol/Å [47].
After energy minimization, the relaxed system was submitted to a simulation time of 100 ns with a time step of 2 fs, using a Nose-Hoover thermostat at 300 K and using Martyna-Tobias-Klein Barostats at 1.01325 bar to simulate the NPT ensemble [48]. In the simulation process, the system energy was calculated every 1.2 ps and the trajectory was recorded every 4.8 ps. Finally, in order to determine the stability of the two complexes, a series of data of the two complexes were analyzed, including root mean square deviation (RMSD), root mean square fluctuation (RMSF), protein-ligand interaction, and potential energy (U) [49].

Binding free energy calculation
The molecular mechanics generalized Born surface area (MM-GBSA) method combines molecular mechanics energy, non-polar solvation conditions, solvation model, non-polar solvent accessible surface area, and van der Waals interaction to calculate protein-ligand complex binding [50].
The compound 55 with higher activity and compound 5 with lower activity were selected to form two complexes which have large difference in biological activity with CD73, and their binding free energy was calculated by Prime using molecular mechanics generalized Born surface area (MM-GBSA), which was embedded into Schrӧdinger software, and three formulas were involved in the calculation [51]: In the above formula, ∆G bind is the binding energy. The G complex , G complex , and G ligand are the corresponding free energy. ∆E MM represents the minimized energy differences between the complex structures of protein-ligand and the sum of the energies of the ligand and protein [52]. TΔS was obtained based on the conformational entropy change caused by ligand binding. The electrostatic solvation energy ∆G PB (polar contribution), non-electrostatic solvation energy ∆G SA , and ΔEDISPER were added together to get the free energy of solvation ΔGsol. ΔE ele represents electrostatic energy and ΔE vdw represents van der Waals energy.

Molecular docking study
Molecular docking is a computational tool in which receptors and ligands recognize each other to form molecular complexes through energy matching, spatial matching, and chemical property matching, and predict the structure of the complex. To evaluate the reliability of the XP algorithm in Glide, the co-crystal ligand was redocked into the crystal structure of CD73 (PDB code: 6Z9B). The root mean square deviation (RMSD) between the native conformation and the redocked conformation was 0.483 Å, which indicates that the XP algorithm in Glide was reliable, and it can be considered that this algorithm can be used for subsequent molecular docking studies. As shown in Fig. 1b, it can be seen that the native conformation and the redocked conformation of the natural ligand overlap ΔG bind = ΔE MM − TΔS + ΔG sol ΔG bind = ΔE vdW − ΔE ele + ΔG GB + ΔG SA + TΔS each other well. Therefore, all inhibitors in the dataset were docked into the active pocket of CD73, and the docking scores are shown in Table S1.
The compounds in our dataset were derived from four types, including partial compounds obtained by substituting AMPCP as the skeleton and three skeletons obtained by substituting its adenine with different nitrogen heterocycles such as pyrazolopyrimidine, pyrrolopyrimidine, and pyrazolopyridine. By analyzing the structural information of the compounds, it was found that they have similar interactions with protein.
For compounds with AMPCP as the skeleton, compound 5 was taken as an example to analyze the binding mode of this series of compounds and proteins. As we can see from Fig. 2a, it was found that several hydrogen bonds were formed. It can be seen that most of the hydrogen bonds were formed by two phosphate groups, among which HIS38, ARG395, and ARG354 formed hydrogen bonds with oxygen atoms, and ASN117 and ASN245 formed hydrogen bonds with oxygen anions. The existence of these hydrogen bonds ensured the stability of the interaction between the compound and the protein. At the same time, HIS118, ARG395, and ARG354 all formed a salt bridge with the negative oxygen atom on the phosphoric acid group. It can also be seen that the hydroxyl group on the tetrahydrofuran ring formed a hydrogen bond with ASP506, and ASN390 formed a hydrogen bond with the N atom on the pyrimidine ring. In addition, there were hydrophobic effects on the imidazopyrimidine ring; residues PHE500 and PHE417 work with the two nitrogen heterocycles to form pi-pi stacking interaction.
For pyrazolopyridine compounds, compound 55 was taken as an example, as shown in Fig. 2c, d. ASP36, ARG395, HIP118, ASN117, ARG354, and ASN245 formed hydrogen bonds with phosphate groups and ASP506 with hydroxyl groups on tetrahydrofuran. These hydrogen bonds increased the stability of binding to proteins. In addition, two salt bridges are formed with ARG354 and ARG395 through the oxygen atom of the phosphate group. PHE500 and PHE417 formed pi-pi stacking interaction with pyrazole ring and pyridine ring to produce hydrophobic effect.
As for pyrrolopyrimidine and pyrazolopyrimidine compounds, the effective inhibitors (compounds 54 and 56) are shown in Fig. 2e-h. The interaction between compound 54 and protein was similar to that of compound 55, except that the hydrogen bond formed by compound 54 and ASP36 was changed to combine with HIP118. In addition, the salt bridge formed by oxygen atom and HIP38 was added. On the substituent of R1, LEU formed a new hydrogen bond with the N atom, and the benzene ring formed a pi-pi stacking interaction with PHE500. The compound 56 lacked the hydrogen bond of ASN245 on the basis of 54. In addition, it was found that the nitrogen on the pyrazole ring produces hydrogen bond with ASN390, and the rest of the forces were the same.
From the above interaction analysis, it can be seen that these four kinds of compounds with different frameworks had some interaction in common, and their different biological activities and structural characteristics provided data for the 3D-QSAR research. However, it has been reported that compounds with ribose analogs are more potentially active than compounds with β-fluoroarabinose, and the substitution of a diaza heterocycle for the adenine group may have better activity. Therefore, compounds with pyrazolopyrimidines as the core had better inhibitory activity against CD73 and were selected as objects for further research.

CoMFA and CoMSIA statistical results
The accuracy of the model was directly related to the alignment methods, so ligand-based methods and docking-based were used to construct models respectively. Both methods were analyzed by PLS, and the following parameters were used to evaluate the quality of the model, including Q 2 , ONC, SEE, R 2 , and F value. The calculated results are showed in Table 1 and the alignment results are represented in Fig. 3. After analyzing the results, it can be found that the model parameters based on the docking were not within the acceptable range while the parameters of ligand-based model all meet the study requirements which indicated a high model quality; therefore, the model constructed by ligand-based method was applied to the following 3D-QSAR study. The detailed data of CoMFA model were Q 2 of 0.708, ONC of 8, SEE of 0.170, R 2 of 0.983, and F value of 241.755; the contributions of steric field and electrostatic field were 78.8% and 21.2%, respectively, which meant that the steric field played a crucial role in the CoMFA model. As for CoMSIA model, the Q 2 of 0.809, ONC of 10, SEE of 0.121, R 2 of 0.992, and F value 383.332 and the contributions of H-bond acceptor, hydrophobic, steric, and electrostatic fields were 19.2%, 36.3%, 16.2%, and 28.4% respectively, indicating that the hydrophobic and electrostatic fields were the main contribution in CoMSIA model.
The reliability of the CoMFA and CoMSIA models was verified with 18 molecules in the test set, and the statistical parameters R 2 pred were 0.987 and 0.991, respectively. Table 2 lists the experimental activities and predicted activities generated by the 3D-QSAR model, and the correlations between them are shown in Fig. 4. It can be observed from the graph that there was a good linear relationship between experimental activities and predicted activities. The above data showed that a reliable and high-quality three-dimensional quantitative structure-activity relationship (3D-QSAR) model has been established.

CoMFA and CoMSIA contour maps analysis
The contour map of the QSAR model helped to visually illustrate the relationship between structural features and biological activity. This relationship used the field type stdev*coeff field to determine the results of CoMFA and CoMSIA, in which the contributions of all favorable and unfavorable regions were set as default parameters. In this study, high-activity compound 55 (IC 50 = 0.041 nM) and low-activity compound 5 (IC 50 = 18,000 nM) were selected as representative inhibitors to explain the structure-activity relationship of these inhibitors.
The CoMFA steric contour maps of compounds 55 and 5 are shown in Fig. 5a, b. The green contour lines indicated that bulky groups in the region were conducive to increased activity, while yellow was the opposite. It can be seen that the green part was mainly distributed near the pyrimidine group in compound 5, indicating that the substitution of large groups in this region will increase the activity of the compound, which explained why the activity of compounds 15 and 49 whose hydrogen atoms on pyrimidines are replaced by cyclohexane was better than that of compound 5; the high activity of compound 55 may be due to the large benzene ring substitution in this part. At the same time, it can also be observed that the chlorine atom on the pyrimidine group of compound 5 was covered by the yellow part, indicating that this position can be replaced by small groups as much as possible to have better activity, which can explain why the activity of compounds 40 and 44 with substitution of benzene ring and epoxy ring was not satisfactory.
The CoMFA electrostatic profiles of compounds 5 and 55 are shown in Fig. 5c, d. The red outline indicates where the negatively charged group can enhance the biological activity, and the blue outline indicated that the positively charged substituent can increase the biological activity. It can be seen that the blue part was mainly distributed in the adenine part of inhibitors 5 and 55, because this part contains multiple N atoms which were negatively charged, so positively charged groups can be considered for substitution on the R1 substituent.
The assumption of the CoMSIA method was similar to that of the CoMFA method, the non-covalent interaction between the ligand and the receptor was determined by the molecular field. The calculation of the molecular field used five different similarity field calculations: steric field, electrostatic field, hydrogen bond field (including hydrogen   The white outlines indicated that the introduction of hydrophilic substituents in these portions can increase activity, while the yellow outlines indicate areas that were favorable       for hydrophobicity. The R 1 on the pyrimidinyl group of the compound 5 was surrounded by a white part, indicating that if this site is substituted by a hydrophilic group, the activity of the compound will increase. At the same time, there were a lot of white parts around the imidazole ring, indicating that this part should have a hydrophilic group. In addition, R 2 was surrounded by a yellow outline, indicating that a hydrophobic substituent was required.
In the HBA region shown in Fig. 5g, h, the blue-green contour lines indicated increased activity of the hydrogen bond acceptor, while the orange contour lines were the opposite. The blue-green region was close to the R 2 and the imidazolopyrimidine ring, indicating that the introduction of HBA chemical substituents at these positions can help improve biological activity.
As shown in Fig. 6, the structural characteristics of CD73 inhibitors to increase the activity were summarized from the CoMFA and CoMSIA models. It can be seen from the structural activity model that we have established that the imidazopyrimidine ring on the molecular skeleton mainly played a hydrophobic role and it needed hydrogen bond receptor. The R 1 part has larger groups, hydrophilic and positively charged groups will have greater activity, R 2 if there were small groups, hydrophobic receptor groups will enhance the activity, if R 3 are hydrophilic groups existence is crucial, and this should be preserved during structural modification.

Design for new CD73 inhibitors
Based on the results of 3D-QSAR and molecular docking, 10 new CD73 inhibitors were designed on the basis of the template molecule 55 and their pIC 50 were predicted using the constructed COMFA model. The chemical structure and predicted activity of ten designed compounds are shown in Table S2. The green part around the R1 substituent of the nitrogen atom indicates that the substitution of a large group is conducive to improving the activity, so the compounds D1-D3 were designed and their activity were better than 55. Considering that when R 2 was a small hydrogen bond acceptor group, it was more conducive to the improvement of activity, so the compounds D4-D5 were designed. The hydrophilic group as R3 substituent successfully designed compounds D6-D8. The R4 design was replaced by a nitrogen-containing heterocyclic ring with more hydrogen bond acceptors to obtain D9-D10; their activity also better than 55.

ADMET property prediction
To further explore the drug-likeness of the designed compounds, we predict their ADMET property to evaluate whether they can be potential drug candidates. The results obtained by predicting the ADMET properties of each compound are shown in Table 3, and the parameters in the unacceptable range are presented in bold. It can be seen that the log K hsa and log BB of most compounds are in the unacceptable range, indicating that it is difficult for these drugs to transport to various tissues and penetrate the nervous system. All compounds have poor Apparent MDCK Permeability and Human Oral Absorption, indicating that these drugs cannot easily pass through the cell monolayer and are not suitable for oral administration. Happily, all the compounds have no toxicity including mutagenicity, tumorigenicity, irritating effect, and reproductive effects, except D3 has low tumorigenicity, and they all of them have good CLP and drug-likeness.

Molecular dynamics simulation
In order to illustrate the stability of the conjugates of compounds 5 and 55 with CD73 and the contribution of key residues in the active pocket, a 100-ns molecular dynamics simulation was used to evaluate the complex. As shown in Fig. S2, 19.1% of the residues are distributed in the α-helix, and 20.2% of the residues were distributed in the β-strand, which ensure the stability of the protein and the low fluctuation of the RMSD value.
For the complex of CD73 and compound 5, the RMSD value of the protein backbone rose to about 2.5 Å in the first 20 ns, and kept fluctuating. Until the end of the 60-ns Fig. 6 The crucial chemical features involved in the activity of CD73 inhibitors obtained from the 3D-QSAR study Fig. 5 The CoMFA steric contour maps for compound 5 (a) and compound 55 (b). Green color and yellow color represent favorable and unfavorable regions for the steric field. The CoMFA electrostatic field contour maps for compound 5 (c) and compound 55 (d). The red color shows the favored negative electrostatic region, and the blue color shows the favored positive electrostatic region. The CoMSIA hydrophobic contour maps for compound 5 (e) and compound 55 (f). The yellow color shows the favored hydrophobic region, while the white color shows the disfavored hydrophobic region. The CoMSIA hydrogen bond acceptor contour maps for compound 5 (g) and compound 55 (h). The green-blue color represents the favored hydrogen bond acceptor region, while the orange color represents the disfavored hydrogen bond acceptor region Table 3 ADMET property of the template compounds and the novel compounds simulation, the RMSD value stabilized at around 2.2 Å (Fig. 7a). In addition, during the entire simulation process, the RMSD value of the ligand configuration fluctuated frequently in the first 20 ns to 1.5 Å and reached equilibrium at about 80 ns (Fig. S3). These results indicated that the complex reached a steady state at the end of the molecular dynamics simulation. As for the complex with compound 55, the RMSD value of the protein backbone rose to close to 2.0 Å within 30 ns, and stabilized to 1.7 Å at 80 ns (Fig. 7a). During the whole process, the RMSD value of compound 55 did not change much, and at the same time, the RMSD value of its ligand kept fluctuating around 1.5 Å, indicating good stability during the simulation process.
The changes in the flexibility of the protein were detected by averaging the atoms of all the residues, and the root mean square fluctuation (RMSF) value of the main chain residues was calculated. The graphs of the two complexes are shown in Fig. 7b. In RMSF, most of the residues fluctuated within the range of 2.5 Å, and very few residues exceeded 3.0 Å. In addition, the overall change trend of the amino acid residues of the two compounds was similar, and the changes of some key residues were even completely the same, indicating the two complexes had similar binding mode.
In the MD simulation process, through the monitored receptor-ligand binding mode, a histogram based on the duration and docking results was obtained. The diagram shows a variety of interactions including hydrogen bonds, hydrophobicity, ions, and water bridges.
For compound 5 with lower activity (Fig. 8a), because the phosphate group contains multiple oxygen atoms, it can form hydrogen bonds with multiple residues, including hydrogen bonds with ARG395, HIS38, HIS243, HIS118, and ARG354, respectively, accounting for 113%, 65%, 96%, 99%, 97%, 93%, and 88%. In addition, the hydroxyl groups on the tetrahydrofuran ring also formed hydrogen bonds with ASP506, accounting for 36%. It can be seen that hydrogen bonds played a vital role in the interaction between the ligand and the receptor.
For compound 55 (Fig. 8b), similar to compound 5, most of the hydrogen bonds were formed with the oxygen atom of the phosphoric acid group, such as HIS243, ASN117, HIS118, HIS38, ARG354, and ARG395, and account for 45%, 33%, 81%, 73%, 100%, 98%, 57%, 43%, and 91% respectively. In addition, it was different from compound 5 where ASP506 formed a hydrogen bond with the two hydroxyl groups of tetrahydrofuran in a proportion of 75% and 77%. Meanwhile, ASN390 also formed a hydrogen bond with a hydroxyl groups in tetrahydrofuran in a ratio of 86% and formed a hydrogen bond with the N atom of the pyridine ring in a ratio of 94%. In addition, due to the hydrophobicity of the pyrazole ring, PHE417 formed pi-pi stacking interaction with it and accounting for 71%.
During the entire molecular dynamic simulation process, the rotatable bond in the two molecules was also investigated. The ligand torsion diagram shows the conformational changes of each rotatable bond (RB) of the ligand in the entire simulation trajectory (0.00 to 100.00 ns). It can be seen that compound 5 had 8 rotatable bonds and compound 55 has 10 rotatable bonds (Figs. S4 and S5). This was because compound 55 has a larger group substitution on R 1 but compound 5 does not. These bonds increased the possibility of binding between the active site and residues by changing the spatial position, thereby increasing the activity. In contrast, compound 5 had a small number of rotatable bonds, which hindered its good interaction with proteins, and also the reason for its reduced biological activity.

Binding free energy calculation
The MM-GBSA algorithm was used to calculate the binding free energy of the complex formed by CD73 with the higher activity inhibitor 55 and the lower activity compound 5. The statics and histograms are shown in Table 4 and Fig. S6. It can be seen from the results that the ΔG Bind value of compound 55 (−69.82 kcal/mol) was significantly lower than that of compound 5 (−48.49 kcal/mol), which was consistent with the difference in IC 50 value.
The binding free energy was consisted of several types of interactions, among which Coulomb and van der Waals contribute the most to the total free energy of binding of the two compounds. In addition, the values of these two forces of  compound 55 were lower than those of compound 5 because its binding site forms additional interactions with surrounding residues. The ΔG Bind_covalent of inhibitors 5 and 55 were 6.89 kcal/mol and 9.25 kcal/mol respectively, both of them were positive, indicating that covalent contact hardly contributed to the binding free energy. The ΔG Hbond values of compounds 5 and 55 were −11.08 kcal/mol and −12.21 kcal/ mol, respectively, because the numbers of hydrogen bonds formed with compound 55 and the hydrogen bonds formed with ASN390 were more stable than those of compound 5.

Conclusion
In this paper, a series of CD73 inhibitors (5′-(α,βmethylene)-diphosphate) with the same skeleton were studied by 3D-QSAR, molecular docking, molecular dynamics simulation, and binding free energy calculation. Molecular docking studies found that 60 CD73 inhibitors interact with related proteins, revealing several key residues, including ARG395, ARG354, ASN117, ASN245, HIP119, and ASP506. The CoMFA and CoMSIA models were calculated based on the ligand-based molecular alignment with high reliability. In the 3D-QSAR study, the Q 2 and R 2 of the CoMFA model were 0.708 and 0.983, respectively, and the Q 2 and R 2 of the CoMSIA model were 0.809 and 0.992 respectively. In addition, the contour maps can explain the relationship between the chemical structure of the compound and the biological activity, which provided a certain degree of hints and guidance for the design of new molecules. Furthermore, the ADMET prediction indicated that the designed compounds have no toxicity but poor permeability. Besides, molecular dynamics simulations were used to analyze the stability of the two docking results. Through the exploration of the entire simulation process, important residues that contributed to the docking results were found. Subsequently, the binding free energy of the two molecules was calculated using the MM-GBSA method, and the results showed that van der Waals and Coulomb energy made major contributions to these two molecules. Using a variety of calculation methods to clarify the mode of action and chemical properties of CD73 inhibitors, which had guiding significance for the further design of high-efficiency CD73 inhibitors with high anti-cancer activity.