2.1 Virtual screening
The crystal structure of SHP2 was obtained from the Protein Data Bank with the accession code 5EHR, of which the resolution is 1.7 Å. The coactivators, as well as all water molecules, were removed from the structure. Hydrogen atoms and charges were added using the Protein Preparation Wizard module in Maestro (Schrödinger Inc, version 10.1). The crystal structure, after optimization of the hydrogen bond network, was subjected to energy minimization using the OPLS3 force field. The grid-enclosing box was placed on the center of the crystallographic ligand and defined to encompass residues within a 15 Å radius around the ligand, constituting the binding pocket. The remaining settings were kept at their default values. Under conditions of pH ranging from 5.0 to 9.0, the addition of hydrogen atoms and ionization facilitated the generation of stereoisomers and 3D conformers for compounds housed within the ChemDiv database (∼2,990,000 compounds), employing the Ligprep v3.3 module. Utilizing both the Glide standard precision (SP) and extra precision (XP) methods18, 19, molecular docking was performed with default parameters, retaining only the highest-ranked pose for each molecule. After Glide SP scorings, the top 15,215 ranked molecules were retained for subsequent calculations using the more precise Glide XP scoring function. Finally, the top-ranked 1,520 molecules were retained for further visual observation. Based on the Tanimoto similarity metric, a hierarchical clustering method was employed to cluster the 1,520 MOLPRINT2D fingerprint features into 25 groups. Taking into consideration ligand binding poses, structural diversity, and novelty, 25 candidates were selected from the pool of 1,520 compounds and subsequently subjected to activity testing.
2.2 Fragment molecular orbital (FMO) calculation
Structure preparation. Prior to the FMO calculation, the complex structures of the ligands and SHP2 were prepared by the Molecular Operating Environment (MOE, version 2014.09). All binding modes of SHP2-ligand complexes in this report were derived from crystallography and molecular docking simulations. The ligand and residues around 4.5 Å of the ligand were cut off into the entire system researched. Hydrogen atoms were added to the complex structure and ionization states of the acidic and basic amino acid residues were assigned with the Protonate 3D tool implemented in MOE applied by default settings (pH = 7.0, a solvent dielectric constant of 80). Then, the structures were performed energy minimization using the semiempirical Amber10: EHT forced field. During this process, a constraint was applied, restricting each atom from deviating no more than 0.5 Å from its original position, a parameter enforced by the MOE program. We fragmented the entire optimized system using Facio (version 19.1.5) to prepare the FMO input files.
FMO method and inter-fragment interaction energy (IFIE). The FMO method was employed to partition a complex biological system, such as a protein, into smaller parts referred to as fragments, typically corresponding to individual residues. This approach saves a large amount of computation time compared to the traditional quantum mechanical (QM) methods20, 21. Generally, the FMO calculation contains the following steps: (1) partition a large system into multiple smaller fragments, (2) iteratively perform Ab initio calculations on each individual fragment (referred to as monomers) within the context of an embedding polarizable potential generated by the surrounding fragments until self-consistency is achieved in the electron densities, (3) calculate self-consistent field (SCF) of fragment pair, introducing interfragment charge transfer along with other quantum effects into the calculations, (4) calculate the total energy of the system.
In our calculation, all FMO calculations were performed using General Atomic and Molecular Electronic Structure System (GAMESS, version Sep, 2020) 22, 23 and were performed by second-order Møller-Plesset perturbation (MP2) methods with the 6-31G* basis set (FMO-MP2/6-31G*). We treated the amino group as neutralized residue to avoid overestimating charge-charge interactions due to calculations in vacuum conditions. The FMO-based IFIE between fragment i and j, \(\Delta E_{{ij}}^{{\operatorname{int} }}\), is given by
\(\Delta E_{{ij}}^{{\operatorname{int} }}=\Delta E_{{ij}}^{{{\text{ES}}}}+\Delta E_{{ij}}^{{{\text{EX}}}}+\Delta E_{{ij}}^{{{\text{CT+MIX}}}}+\Delta E_{{ij}}^{{{\text{DI}}}}\)
Where \(\Delta E_{{ij}}^{{{\text{ES}}}}\),\(\Delta E_{{ij}}^{{{\text{EX}}}}\),\(\Delta E_{{ij}}^{{{\text{CT+MIX}}}}\)and \(\Delta E_{{ij}}^{{{\text{DI}}}}\)represent the electrostatic, exchange repulsion, charge transfer with higher-order mixed and dispersion terms, obtained by the pair interaction energies decomposition analysis (PIEDA)24.
2.3 Molecular dynamics simulation
In order to investigate the stability and behavior of the 7188-0011-protein complex, molecular dynamics (MD) was performed using AmberTools18 simulation package (University of California, San Francisco, CA, USA ) for 100 ns. The parameters file was generated for protein using the FF14SB forcefield, while the corresponding ligand parameters file was generated using antechamber and parmchk modules in AmberTools18. Subsequently, the complex was neutralized by adding appropriate counter ions (Cl−) and was solvated in a cubic box using TIP3PBOX water molecule with the box’s edge at least 1.0 nm away from the complex. Energy minimization was achieved by employing the steepest-descent algorithm for 5000 steps, which was subsequently followed by the conjugate-gradient algorithm for an additional 5000 steps. Subsequently, pre-equilibration took place in two sequential steps, comprising a constant NVT ensemble (maintaining a constant particle number, volume, and temperature) and a constant NPT ensemble (maintaining a constant particle number, pressure, and temperature). The former indicates that the simulation system was heated up progressively from 0 to 300 K for 200 ps using Langevin thermostat, and the latter was used to perform pressure equilibration for 100 ps at 300K using Berendsen barostat. Finally, the position restraint was removed, and the production run was carried out for 100 ns with a 2 fs step. The RMSD, RMSF, H-bonds, and cluster analysis were calculated with cpptraj utility in AmberTools18. Furthermore, the binding energy of the complex was estimated for 250 snapshots using MM/GB(PB)SA methods implemented in the mmpbsa.py tool of AmberTools18.
2.4 Biochemical assay
The gene encoding human SHP2WT (1-593) and SHP2PTP (237–529) were inserted into the pET30 vector respectively. Two proteins were purified and expressed as described previously in the literature25. The allosteric activation of SHP2 was observed upon the binding of a phosphorylated peptide containing bis-tyrosyl to its Src homology 2 (SH2) domain. This activation process leads to the liberation of the inhibitory interface of SHP2, thereby enabling the accessibility of its PTP domain for substrate identification and catalysis. The rapid fluorescence assay was employed to detect the catalytic activity of SHP2, utilizing DiFMUP as a surrogate substrate. The experimental reagents prepared were as follows: 12.5 µL of buffer (75 mM NaCl, 75 mM KCl, 60 mM HEPES, pH 7.2, 1 mM EDTA, 0.05% P-20, 5 mM DTT), 2.5 µL of peptide IRS1_pY1172 (dPEG8) pY1222 (Shanghai Shengzhao Biotech Co., Ltd. cat. no. 79319-2), 2.5 µL of test compound, 2.5 µL of SHP2WT or SHP2PTP. And 20 µL reaction buffer was added to the 384-well plate (Perkin Elmer, cat. no. 6008260) and incubated for one hour at ambient temperature. Afterwards, the reaction was initiated by adding 200 µM surrogate substrate DiFMUP (Invitrogen, cat. no. D6567), followed by incubation at 25°C for an additional 30 minutes. Subsequently, a 5 µL solution containing bpV(Phen) with a concentration of 160 µM (Enzo Life Sciences cat# ALX-270-204) was added to halt the reaction. The fluorescence signal was then monitored by exciting the sample at 340 nm and measuring the emitted light at 450 nm. The inhibition rates were calculated by measuring the percentage of dephosphorylation of DiFMUP catalyzed by the enzyme. For each inhibitor, seven different concentrations were used to determine the IC50. SHP099 (Guangzhou Qiyun Biotechnology Co., Ltd., cat. no.1801747-42-1) was used as the positive control. Each experiment was performed at least three parallels.