Exploring the effectiveness of SARS-CoV-2 3CLpro inhibitors through molecular dynamics and binding free energy simulations

In December 2019, in the city of Wuhan in China, a novel Coronavirus was identied as the causative agent of a Severe Acute Respiratory Syndrome, later called Corona Virus Disease 2019 (COVID-19). Since the identication of the agent and sequencing of its genome, proteases such as 3CLpro and PLprothat participate in the viral cycle have been identied as possible pharmacological targets. This study aimed to evaluate the anity of SARS-CoV-2 3CLpro (PDB 6LU7) concerning promising binders identied by other studies using virtual screening against the ZINC database and other molecules within the possibility of inhibiting the protease, such as Hydroxychloroquine, Chloroquine, and Remdesivir. Around 1,140 ns of molecular dynamics simulations were performed to evaluate stability and binding free energy values of protein-ligand complexes (60 ns for each ligand). The estimated anity, based on 5000 frames trajectories, revealed that N3, 11b, remdesivir and a ZINC database ligand are the most promising inhibitors of 3CLpro.Given that most studies present energy results based on docking runs and one-frame coordinates analyses, the results found may present a more accurate energy value and may help experimental approaches to developing a drug against COVID-19. Despite the importance of vaccine development, alternative strategies, such as specic viral inhibitors, are important to reduce the impact of the disease on people.


Introduction
In December 2019 in Wuhan city in China, local doctors identi ed a new coronavirus as the causative agent of a severe acute respiratory syndrome type [1] later called Coronavirus Disease 2019 (COVID- 19). COVID-19 is a disease caused by SARS-CoV-2 [2], SARS are commonly characterized by high fever, malaise and non-productive cough and in some severe cases can cause generalized interstitial in ltration of the lungs, leading to the cases of intubation and mechanical ventilation of those affected [3]. From the appearance of the rst cases in China in December 2019 up to the month of July 2020, about15,785,641 cases and 640,016deaths had been reported [4]. Because of that, several studies have been performed seeking to nd a treatment or vaccine for the virus.
The genome of the human coronavirus has approximately 29.8 to 29.9kb ssRNA [5], which decodes various polypeptides that give rise to structural and non-structural proteins; these polypeptides are proteolytically cleaved to form several proteins. Some of these proteins are of fundamental importance for the viral cycle [6], such as two overlapping proteins called pp1a and pp1ab or replicases 1a and 1ab, with molecular weights of450kD and 750kD respectively [7]. These proteolytic processes are mediated by 3-chymotrypsin-like protease (3CLpro) also called Mpro and papain-like protease (PLpro) [8]. This information makes these proteases potential targets for anti-coronavirus inhibitors [2,6,9]. The crystallographic structure of 3CLpro revealed that the protein is composed of three domains: residues 8 to 99 make up the rst domain, 100 to 183 the second domain, both composed of beta antiparallel sheets; the third domain consists of ve helices that are responsible for the proteolytic activity of the protein (residues 200 to 300). A large looping of 16 residues (184 to 199) connects domains II and III [10].
After comparing with sequences of other viruses of the same family, such as mouse hepatitis virus (MHV), bovine coronavirus (BCoV) and SARS-CoV, it was observed that in both species the 3chymotrypsin-like protease had a high identity indices, 50%, 49% and 40% respectively [11]. This fact allowed the identi cation of the substrate binding site, located in a cleft between domains I and II, similar to cysteine proteases and serine proteases [10,11]. The catalytic site of HCoV 229E Mpro has histidine and cysteine (His41 andCys 144) differing only because it does not have the catalytic triad characteristic of this type of proteases, since that in the third position it has valine (Val84) instead of cysteine [6,10,11].
Virtual screening studies were realized to nd inhibitor candidates among thousands of molecules on compounds databases [12,13] and the results were promising. Here, the best candidates were selected based on docking results and toxicity predictions, for simulation by molecular dynamics and binding free energy calculations for corroborating their function as SARS-CoV-2 protease inhibitors.

Protein Structure
Currently, there are 57 structures of Mpro placed in the Protein Data Bank. This study, used the Mpro structure of SARS-CoV-2, achieved by X-Ray Diffraction, with 2.16 Å resolution (PDB code: 6LU7) [10]. The ligand selection was performed based on their toxicity levels according to the classi cation made by HOFMARCHER [14]. Ligands with toxicity values less than 10 were subjected to molecular docking.

Molecular Docking
The Template Docking function of Molegro Virtual Docking 5 software was used to determine the initial conformation of the ligands at the active site of the protease. MolDock scoring function with a grid resolution of 0.30 was used to rank the solutions, toggling on Internal ES, Internal HBond, and Sp2-Sp2 torsions. Docking results were further re ned by energy minimization and H-bonds to optimize hydrogen bonds after docking. Each docking had 10 runs, a population size of 100 and max iterations of 2000.

Molecular Dynamics
To determine the protonation state of the residues, we used the H++ server (http://biophysics.cs.vt.edu/) considering a pH of 7.0.The AMBER 18 package was used to perform all steps of preparation and molecular dynamics [22]. Cl-and Na + ions were used to neutralize possible system charges and the system was solvated using the TIP3P water model in a cubic box of8 Å. Sander performed the energy minimization in 5 stages. The rst four stages used 3000 cycles of steepest descent and 5000 cycles of conjugate gradients. In the remaining stage, 5000 cycles of steepest descent and 30000 cycles of conjugate gradients were used. The heavy atoms had their movement restrained by a harmonic potential of 1,000 Kcal/mol*Å 2 , while in the nal step all atoms were free. For heating and equilibration, 14 steps were used. A harmonic potential of 25 Kcal/mol*Å 2 was employed to restrain the heavy atoms in the initial stages. In step 13, the restraining force was set to 0. The temperature was gradually increased until it reached 300 K. Langevin Dynamics (thermostat) was employed with a collision frequency of 3.0 ps −1 .
The heating procedure lasted 650 picoseconds until step 13 and was performed using an NVT ensemble. Afterwards, a 2-nanosecond equilibration phase was employed in an NPT ensemble. The SHAKE algorithm was employed to restrict the vibration of the ligations of all hydrogen atoms. The Particle Mesh Ewald method was used for calculating electrostatic interactions using a cutoff value of 8.0 Å. Thus, 60 ns MD was produced in an NVT ensemble, for each system. The cpptraj module was used to compute the Root Mean Square Deviation (RMSD) of both trajectories, considering the heavy atoms of the main chain.For this study, ve thousand snapshots from the last 10 ns of the molecular dynamics simulations were used to calculate the free energies using MM-GBSA and MM-PBSA methods.For this study, ve thousand snapshots from the last 10 ns of the molecular dynamics simulations were used to calculate the free energies using MM-GBSA and MM-PBSA methods.

Results And Discussion
The dynamics results are available in gure 1, with residues that interacting with ligand, and the RMSD information on the N3, Remdesivir, and 11b ligands. The results show that the stabilizing control ligands have high stability with respect to the displacement of alpha carbons along the trajectory; the remdesivir ligand showed homogeneity in its RMSD along with the molecular dynamics simulation; however, it presented high values of RMSD.
The values of free binding energy calculated by the MMGBSA and MMPBSA methods are shown in table 1, the values demonstrate that the control ligands show a considerable a nity for the active protease site, while the virtual screening ligands did not present values that were so considerable.
Among the ligands selected by virtual screening and those analyzed by this study, the most promising was ligand ZINC000016317677, with energies of -38.5395 MMGBSA and -36.3917 MMPBSA.However, it does not have a nity signi cant enough to be considered an effective protease inhibitor and the other ligands coming from the virtual screening also did not show enough signi cant a nity to be considered Mpro inhibitors. This fact con rms the importance of using combined bioinformatic together since this is a purely in silico area, so the combined use of the available tools is of utmost importance in order for the results to be more robustly rati ed, providing solid results for beginning or not beginning in vitro studies, thus optimizing research time and reducing costs.

N3, 11b,chloroquine and hydroxychloroquine
The values of free energy generated based on the MMGBSA and MMPBSA methods from molecular dynamics revealed that the irreversible Mpro N3 and 11b inhibitor [10] has a signi cant a nity to block the active protease system, which in the energy decomposition showed signi cant interactions with residues HIS41, MET49, CYS145, MET165 and HIST163, HIE164, MET165 respectively, as shown in gure 1. Similar studies proposed N3 connection energy of approximately -64.32 kcal/mol, using the prime MMGBSA tool [13], while the one found by this study was -44.6905 kcal/mol in the MMGBSA method and -39,4006 kcal/mol in the MMPBSA method, slightly discordant values. It is noteworthy that the AMBER module responsible for computing the free energy performs the calculations based on trajectory data for 5000 snapshots,being able to perform calculations with more precision because it takes into account the conformational variations of the ligand and receiver over time [23]. Another method used to perform the solvation-free energy calculation is the Prime MMGBSA module [24].However, it does not use information from Molecular Dynamics trajectories, and performs the calculation without taking any into consideration. Because the DM simulations are closer to reality, the method performed by the AMBER module can therefore be considered more accurate. Furthermore, other ligands with possible in vitro antiviral activity such as HCQ and HC [16] did not show enough signi cant values of free energy to be considered enzyme inhibitors, which means that they are ine cient against 3CLpro. However, there are still other mechanisms proposed for the antiviral action of these drugs, such as an adsorption inhibitor [15], which prevents Spike proteins of the viral envelope from binding with the angiotensin-converting proteins ACE2 [25], or acting on intracellular acidic organelles that are important for membrane fusion, lysosomes/endosomes [16].

Remdesivir
The drug remdesivir has already been used against MERS and SARS and its mechanism of action is known to be a basic analog that interrupts the extension of the RNA strand in viral replication [25]. In this study, it was proven to be promising as an inhibitor of the studied protease, since it was the molecule that presented free binding energy values closest to the binding energy of the N3 and 11b inhibitor [10].The binder interacts signi cantly with residues ASN142, GLY143, HIS163, and MET165, and the binder interacts weakly with the catalytic residues of the HIS41 and SER144 active sites, but according to the free energy values of -40.2158 in the MMGBSA and -29.3688 MMPBSA, bringing up the hypothesis that it can be used as a drug to inhibit 3CLpro, signi cantly restrain viral replication and thus decrease the host's viral titration. In vitro [18] and in vivo [25] studies con rm the effectiveness of remdesivir as a viral inhibitor, such as the study published by CAO [25], in which there was a signi cant decrease of viral titration in mouse lung cells.

Conclusion
For this study 19 promising candidates to be SARS-CoV-2 Mpro inhibitors were chosen to be evaluated by molecular dynamics and an extensive binding free energy protocol. Because most studies present energy results based on docking runs and one frame coordinate analyses, the results found may present a more accurate energy value and may help experimental approaches to develop a drug against COVID-19.N3, 11b and remdesivir were con rmed as the best 3CLpro inhibitor candidates ;additionally, a ligand discovered by molecular screening seems may be a possible alternative. Chloroquine and hydroxychloroquinedid not present energy values that justify their use in 3CLpro. Therefore, MMPBSA.py script was shown to be a valuable tool for estimating the effectiveness of bioactive compounds.