Binding mechanism of selective cathepsin K/S inhibition revealed from molecular simulations

Cathepsin K and S are two isoforms of cysteine protease with diverse biological functions in the aspect of osteoporosis and autoimmune diseases. Accordingly, the homologous sequence and similar binding site features among CTSK/S may lead to unselective inhibition and side effects. To address such issue, various computational strategies were applied in the current study to explore the selectivity mechanism of CTSK/S inhibitors, including sequence alignment, molecular docking, MD simulations, MM/GBSA energy calculation, and so on. Our findings highlight the notable effects of CTSK residues Glu59 and Tyr67, as well as CTSS residue Asn67, on inhibition selectivity. Overall, this study provides an informative guideline for the rational design of CTSK/S selective inhibitors.


Introduction
Cathepsins (CTS) are closely related to many human diseases and are currently attracting much attention. More than 20 CTS have been identified in the biological world, 11 of which are present in the human body. It has been proved that CTS overexpression promotes the development of diseases such as neurodegenerative diseases [1], osteoporosis [2], and autoimmune diseases [3], which has caused a large number of scholars to conduct extensive research on CTS inhibitors. Due to the highly conserved active site of CTS [4], research interest in recent years has focused on selective inhibition [5][6][7].
Cathepsin K and S (CTSK and CTSS) are two isoforms of cysteine proteases involved in different human physiological functions. CTSK was found in cells like osteoclasts and macrophages, as well as playing a major role in osteoclastic bone resorption for type I collagen degradation [8]. Variants in the expression of the human CTSK gene lead to dense osteogenesis imperfecta [9,10], implying that CTSK inhibitors may serve as an effective antiresorptive therapy for osteoporosis [11] and osteoarthritis [12].
While CTSS is mainly in lymph nodes involved in the occurrence and development of diseases, including immune system [13], pulmonary fibrosis [14], cardiovascular [7], and cancer [15]. CTSS is highly expressed in the major histocompatibility complex (MHC) II and plays a key role in antigen presentation [16,17]. Sophia [3] et al. found that CTSS inhibition suppresses the inflammatory response resulting from autoimmunity. Mice in which the CTSS gene was knocked out exhibit greater resistance than wild-type mice in the development of autoimmune diseases [18]. Thusly, CTSS is considered a potential target for the treatment of autoimmune diseases [19].
Odanacatib is the only CTSK inhibitor so far that has entered phase III clinical trials (NCT01803607, date of registration: 28/02/2013), but its development was discontinued Qinyi Zhong and Jiasi Luan contributed equally to this paper. in 2016 as a result of its ability to reduce fracture risk while leading to an increased risk of cardiovascular disease, particularly stroke [20]. Considering the side effects of CTSK/S homology, it is necessary to develop novel CTSK/S inhibitors with high selectivity. Two target compounds were screened from 17 compounds (Table S1). Among them, compound 1 is a potent small molecule inhibitor binding to CTSK [21]; correspondingly, compound 2 selectively inhibits CTSS [22] (Fig. 1). To reveal the mechanism of selective inhibition, we conducted a systematic study of CTSK/S and their complexes. By comparing the protein structures of CTSK and CTSS (Tanimoto scores: 0.036), analyzing the interaction patterns between CTS and inhibitors, and applying multiple computational methods [23,24] to verify the structural basis of selective inhibitions of CTS, which will provide a hint for the design and development of effective and selective inhibitors in future.

Protein and ligand preparation
The CTSK and CTSS crystal structures were downloaded from the RCSB PDB databank (http:// www. rcsb. org) with the PDB codes 1VSN and 2R9M, while protein sequences were retrieved from the Uniprot (https:// www. unipr ot. org/) and aligned by applying Discovery Studio and PyMOL. Through the Protein Preparation Wizard module within Schrödinger package 2020, the protein structures were protonated and minimized at pH 7.0 ± 0.2 while removing water molecules and adding hydrogen atoms. Ligand structures were prepared and optimized by using the LigPrep module of the Schrödinger package.

Protein contacts atlas
The crucial residues of CTSK and CTSS were identified via the Protein Contacts Atlas program (http:// www. mrc-lmb. cam. ac. uk/ pca/).

Molecular docking
Docking of inhibitors and CTSK/S were implemented by Glide module in the Schrödinger package [25]. Grid files were generated within 20 Å of any specified position of the ligands, then extra precision (XP) and flexible docked conformational modes were selected. Docking results were evaluated and obtained using multiple scoring functions, including Glide GScore, Glide Energy, and Glide Emodel.

Molecular dynamics simulation
Molecular dynamics (MD) simulations of CTSK and CTSS complexes were performed by Desmond in the Schrödinger suite 2020. Firstly, the system was set up as a simple point charge (SPC) solvent model and OPLS_3e force field [26], with Na + and Cl − ions added to keep the system electroneutral. Subsequently, the MD simulation of 100 ns in NPT mode with a maximum iteration for minimization set to 2000 was monitored at 300.0 K and 1.01325 bar, and the time step was chosen to be 2 fs. The root mean square deviation (RMSD) and root mean square fluctuation (RMSF) were calculated to examine the structural stability of the protein-ligand complexes over time.

Dynamic cross correlation matrix analysis
Dynamic cross-correlation matrix (DCCM) analysis is expected to quantify the correlation coefficient of movement between atoms. Relying on displacement from a uniquely determined average coordinate, DCCM analysis gives insight into the correlated motion of the atoms. The DCCM between the i th and j th atoms is defined by the following equation [27]: where r i (t) represents a vector of the ith atomic coordinates as a function of time t; ⟨⋅⟩ t represents the time ensemble average and Δr

Molecular mechanics/generalized born surface area calculation
The trajectories of the ligand-protein complexes during the molecular dynamics simulation process were extracted, and the binding free energy was calculated by the prime module of Schrödinger. Parameters were set to the variable dielectric generalized-born (VSGB) solvent model and OPLS_3e force field, the energy calculation involves the following formulas [28]. To avoid solvent to solvent interactions, the second formula is usually adopted for practical calculations, among which, ΔG bind,solv is the binding free energy of the complex in solvent; ΔG bind,vacuum , ΔG complex,solv , ΔG receptor,solv , and ΔG ligand,solv refer to the binding free energies of the receptor and ligand in the eukaryotic complex, solvent, respectively; ΔG solv,polar and ΔG solv,nonpolar refer to the solvation free energies of polar and non-polar, respectively.

Alanine scanning mutation analysis
Alanine scanning mutagenesis (ASM) is a technique to study the extent to which specific residues contribute to the free energy in enzyme-substrate interactions. Taking advantage of the Schrödinger package to mutate specific residues on the non-main chain of the protein to alanine without affecting the protein conformation, by combining the principle of

Quantum mechanics/molecular mechanics minimization
Quantum mechanics/molecular mechanics (QM/MM) combines the accuracy of QM calculations and the speed Background in blue of RMSF plots indicates residues in the β-sheet and the red indicates residues in the α-helix secondary structure. Compound 1 is shown in purple and compound 2 is displayed as yellow advantage of MM to study molecular interactions between protein and inhibitor. Specifically, the ligands and related residues were set up as the QM region with zero charge and other atoms as the MM region. The DFT-B3LYP method was used with the MM force field set to OPLS_3e via the Qsite module of Schrödinger, and default values were chosen for the remaining parameters.

Structural comparisons between CTSK and CTSS
To compare the structural similarities and differences between CTSK and CTSS, the crystal structures of CTSK (PDB code: 1VSN) and CTSS (PDB code: 2R9M) were aligned using PyMOL and Discovery Studio. As demonstrated in Fig. 2, CTSK exhibited a similar tertiary structure as CTSS, with an amino acid sequence identity of 55.0% and an overall similarity of 71.1%. Meanwhile, several important amino acids in the CTSK active pocket, such as Glu59, Gly64, Gly66, Try67, and Asn158, overlapped notably with the corresponding residues Tyr61, Asn67, Gly69, Phe70, and Asn163 in CTSS (Fig. 3), suggesting that the residues within CTSK/S pockets are highly identical. Remarkably, CTSK residues Cys25, His159, and Asn175, as well as CTSS residues Cys25, His164, and Asn184, all exerted proteolytic effects in the cleft between the N-lobe and C-lobe segments. In a word, CTSK and CTSS

Binding affinity and modes analysis of CTSK and CTSS complexes
Re-docking of the CTSK/S co-crystalline complexes generated RMSD values of 0.916 and 0.541 Å for CTSK (PDB code 1VSN) and CTSS (PDB code 2R9M), indicating the docking process is credible. As disclosed from the results in Table 1, the docking scores of the complexes were consistent with bioactivity results, revealing that compound 1 shows a better binding affinity toward CTSK whereas compound 2 prefers to interact with CTSS. As shown in two-dimensional receptor-ligand interaction diagrams (Fig. 4A), the phenyl of compound 1 formed π-π stacking with CTSK residue Tyr67, while the carbonyl oxygen and the positively charged tertiary amine on piperazinyl of compound 1 formed hydrogen bonds with Gly66 and Glu59, respectively. Comparatively, compound 2 only formed one hydrogen bond with CTSK residue Gln19 (Fig. 4B), suggesting compound 1 exhibited prior binding toward CTSK other than compound 2. While for the CTSS system, only one hydrophobic contact was formed in CTSS/ compound 1 complex (Fig. 4C), but various interactions were established for CTSS/compound 2 complex, such as π-π stacking of the phenyl with Phe70, hydrogen bonds of the sulfone group with Gly69, and acylamino with Asn67 (Fig. 4D), showing compound 2 fits CTSS cavity better than compound 1. In summary, residues Glu59, Gly66, and Tyr67 are crucial for CSTK ligand binding, while residues Asn67, Gly69, and Phe70 contribute significantly to CTSS ligand interaction.

Analysis of molecular dynamics trajectories
To evaluate the conformational stability of CTSK/S complexes, all systems were evaluated using RMSD values of protein α-carbon (Cα) atoms (Fig. S1). As shown in Fig. 5, the RMSD values of all the CTSK/S complexes eventually leveled off after slight fluctuations, indicating their conformations were stable during the MD simulation.
To further investigate the structural flexibility of the local protein, RMSF values were calculated by comparing the transient position of residues to the average one. As displayed in Fig. 6, similar RMSF curves were characterized for CTSK/S complexes, suggesting that different ligands exhibit similar binding modes within CTS sites. Besides, lower peaks of conserved residues in CTSK, such as Cys25, Gly66, Tyr67, His159, and Asn175, as well as CTSS residues like Cys25, Asn67, Gly69, His164, and Asn184, illustrated constrained protein conformation upon ligand binding.
For the CTSS complexes (Fig. 8), compound 2 formed more stable hydrogen bond interactions with CTSS residues Gly69 and Asn67 compared to compound 1, as also verified by hydrogen bond distance detection (Fig. 8E), indicating that compound 2 showed better binding affinity toward CTSS than compound 1. Briefly, CTSK residues Glu59 and Tyr67, as well as CTSS residue Asn67, have an effect on selective inhibition.

Dynamic cross correlation matrix analysis
The dynamic correlation matrix (DCCM) plot described the fluctuating correlations of the Cα atoms of the complexes during the MD simulation. As disclosed in Fig. 9, the CTSinhibitor complexes showed a significantly correlated motion compared to the apo CTS structure. In addition, the CTSK residues C25-E59, T139-N158, and H170-N175, as well as CTSS residues C25-F70, S135-N163, and N184-Y195,

Binding free energies calculation for CTSK/S and inhibitors
To investigate the binding affinities of the inhibitors toward CTS proteins, the binding free energies during the 100 ns MD simulations were decomposed and calculated. As presented in Table 2 and Fig. 10
In addition, residues Asp61 CTSK , Asn158 CTSK , Asn67 CTSS , and Phe211 CTSS showed negative ΔΔG values, indicating their side chains could lead to spatial resistance during interaction with the inhibitors. The mentioned results further illustrate that CTSK residues Glu59 and Tyr67, as well as CTSS residue Asn67, affect the selectivity of inhibition of CTSK/S.

Pharmacophore features for CTSK/S selective inhibitors
To elucidate the chemical characteristics necessary for the binding of the inhibitors to CTSK/S, the complexes obtained from MD simulations were generated as three-dimensional pharmacophores. As shown in Fig. 12, the phenyl group of compound 1 exhibited a hydrophobic interaction, and the cyano group of compound 1 acted as a hydrogen bond acceptor. While for compound 2, both the trifluorotoluene and the trifluoromethyl groups were hydrophobic spheres, as well as the sulfone and the amino group acted as hydrogen bond acceptors and donors, respectively. Moreover, the MD trajectories of the favorable complexes of CTSK/compound 1 and CTSS/compound 2 were clustered, and the obtained conformations were analyzed pharmacologically, proving the reliability and validity of the above (Fig. S3). In conclusion, hydrophobic compounds with a sulfone and amino group as hydrogen bond acceptors and donors can selectively bind to CTSK/S receptors.

Quantum mechanics calculation
Natural bond orbitals (NBO) and Mulliken calculations were performed to explore the effect of inhibitors on hydrogen bond formation. As disclosed from the local charges of the inhibitors shown in Fig. 13, the H38, H51, and H52 atoms of compound 1 displayed positive charges as electron donors, while the O23 and O24 atoms were negatively charged as electron receptors. Likewise, the positively charged N22 in compound 2 served as a hydrogen bond donor, and the negatively charged O29 and O30 served as hydrogen bond acceptors.

Fig. 13
Atomic charges calculated with NBO and Mulliken methods QM/MM calculations were carried out to obtain frontier molecular orbitals HOMO-LUMO. As displayed in Fig. 14, the HOMO orbital of CTSK/compound 1 was located in the phenyl region, and the LUMO orbital focused on the phenylamide group, indicating that the phenyl group is prone to electron transfer. Correspondingly, the HOMO of CTSS/compound 2 was distributed in the amide group and the LUMO was mainly in the aryl group, revealing that the amide group was the key factor to generate selectivity for CTSK/S. In addition, as revealed from molecular electrostatic potential (MEP) plots (Fig. 14), the amide nitrogen of compound 1 was surrounded by positive charges, thereby serving as a hydrogen bond donor to bind with CTSK, while the secondary amino group of compound 2 was a region of high electron density and formed hydrogen bond with CTSS. In a nutshell, these data further support the conclusion that hydrogen bonds between inhibitors and targets are required.

Conclusion
The current study provides valuable information for understanding the selective inhibition mechanism of CTSK/S, revealing CTSK residues Glu59 and Tyr67, as same as CTSS residue Asn67, are key factors contributing to the selectivity of the isoforms, which is of great and profound significance for further development of highly potent and selective CTS inhibitors.