Discovery, synthesis and in silico approach of pyrrolo [3,4-c]pyrroles as SARS-CoV-2 mpro inhibitors

Abstract A new coronavirus has been identified as the contributing agent of the severe acute respiratory syndrome (SARS). The main viral protease (Mpro), which controls the activities of the coronavirus replication complex, which is an essential target for the treatment of coronavirus disease. With the primary objective of targeting this receptor, we designed a new series of pyrrolo [3,2-c] pyrroles, synthesized and characterized using various analytical techniques including FT-IR, UV-Vis and NMR spectroscopic studies. The biological descriptors of the synthesized compounds were investigated using DFT calculation. The mode of binding and reactivity of the target compounds with SARS-CoV-2 main protease (Mpro) were studied using molecular docking and molecular dynamics (MD) simulation. Molecular docking of the compounds (4a and 5a) showed a promising binding affinity towards Mpro protein with the binding energy of −7.8 kcal/mol and −7.0 kcal/mol, respectively. The results of MD simulation and prime MM-GBSA calculation were consistent with molecular docking. The absorption, distribution, metabolism and excretion (ADME) properties of the compounds are in the acceptable range, as they are orally active and obey Lipinski’s rule of five without violation. In addition, in silico toxicity prediction using the Pro-Tox II revealed the non-toxic nature of the compounds. Hence the obtained results suggest that these compounds could be a possible anti-viral candidate and highlight this series of compounds for further drug design and development against SARS-CoV-2. Communicated by Ramaswamy H. Sarma


Introduction
In December 2019, COVID-19 (Coronavirus disease-19), a single stranded positive RNA virus referred as Severe Acute RespiratoPLry Syndrome coronavirus 2 (SARS-CoV-2) was first reported in Wuhan, China and now it is continuously spreading huge wings all over the globe nearly in 212 countries (M€ uller et al., 2021;Sarhan et al., 2021).Initially, bats, turtles, and pangolins were designated as a culprit with the emergence of COVID-19, genome sequencing revealed that it is closely related to bat coronavirus (96% identical) and it also shares a maximum sequence identity of 79.6% to SARS-CoV (Zhou et al., 2020).On March 11, 2020, the disease was declared a pandemic by the World Health Organization (WHO) as it causes significant health and economic impacts (Cucinotta & Vanelli, 2020).As of October 10, 2022, more than 627.112 million people were infected and 6.56 million people have died across the globe and representing an unprecedented challenge to public health.Therapeutic development strategies are of great importance in the current spread of COVID-19.Currently, several antiviral molecules such as hydroxychloroquine (Colson et al., 2020;Yao et al., 2020), RNA-dependent polymerase inhibitor remdesivir (Wang et al., 2020), have been utilized in the clinic, but none of them showed promising effects in clinical trials (Cavalcanti et al., 2020;Mehra et al., 2020;Wang et al., 2020).
Several proteins including helicase, membrane protein, hemagglutinin esterase, spike protein, nucleocapsid protein, protease, envelope protein, and 16 nonstructural proteins (NSPs) are considered a potential drug target against SARS COVID-19 (Prajapat et al., 2020).Among these proteins, protease (M pro ) is highly conserved among SARS-CoV-2 and SARS-CoV, they stand first as a potential therapeutic option for rational drug design of anti-coronavirus therapy.The active site of M pro contains a catalytic dyad (cysteine and histidine) at the 145 and 41 positions mediates the nucleophile function in the proteolytic process (Anand et al., 2003).In addition, it also showed a high degree of conservation in the binding pocket among various coronaviruses.Hence, in this study, we have targeted the M pro of SARS CoV-2 against the synthesized compounds.
In China, Lopinavir-Ritonavir drug formulation (Kaletra) should be given to hospitalized patients affected by COVID-19 and the advantages have been observed by World Health Organization (WHO) (Harrison, 2020).RNA-dependant RNA polymerase inhibitor Remdesivir, has been recognized as a potential therapeutic drug for the treatment of  based on in vitro studies of SARS-CoV-2 clinical isolates (Cao et al., 2020;Wang et al., 2020).Emergency Use Authorization (EUA) has been issued by FDA for the emergency use of Remdesivir based on a placebo-controlled randomized trial for coronavirus treatment (https://www.fda.gov/).In the clinical study, Remdesivir showed remarkable efficiency in lowering the rescue time in COVID-19 patients but unfortunately didn't contribute to substantial development in survival rates, and the efficacy of these drugs remains uncertain.Similarly, the FDA has also authenticated the anti-malarial drugs, chloroquine and hydroxychloroquine for the treatment of COVID-19 patients and still (Colson et al., 2020;Jaffe, 2020), the clinical efficacy of these drugs remains inconclusive.In addition, PAXLOVID TM , (PF-07321332; ritonavir) and molnupiravir drugs were approved and authorized for COVID-19 treatment.Several researchers have reported the biological significance of the heterocyclic azole compounds in anti-viral research, in particular to SARS-CoV-2.The compounds 5-(1H-indol-3-yl)-4-methyl-2,4-dihydro-3H-1,2,4-triazole-3-thione and 3- (2,3-dihydrobenzo[d]thiazol-2-yl)-4H-chromen-4-one were predicted to possess higher binding affinity with molecular targets of SARS-CoV-2 proteins (main protease and human angiotensin-converting enzyme-2).Interestingly, the binding efficiency of the synthesized compounds with the SARS-CoV-2 M pro had comparably equivalent or even sometimes higher than the standard drugs such hydroxychloroquine, chloroquine and remdesivir (Haribabu et al., 2022).A particular derivate of N-(5-nitrothiazol-2-yl)carboxamido (3b) has shown higher binding affinity and greater stability in molecular modeling studies.Further, experimental evaluation of the derivative 3b has shown promising inhibition against SARS-CoV-2 M pro and an anti-SARS-CoV-2 activity with IC 50 values of 11.90 mg/mL and 174.7 mg/mL, respectively (Elagawany et al., 2022).The several bicycloproline-containing inhibitors synthesized by Qiao et al have shown anti-viral activity against M pro of SARS-CoV-2 in in vitro studies with the IC 50 ranging between 7.6 to 748.5 nM, especially the two compounds MI-09 & MI-30 among others exhibited excellent antiviral and anti-SARS-CoV-2 infection activity in cell-line and mouse model experiments (Qiao et al., 2021).Six compounds derived from 2-phenyl-1,2-benzoselenazol-3-one were shown to possess a higher binding affinity with SARS-CoV-2 M pro , where the inhibition can be occurred either by covalent or noncovalent mediated mechanisms.The leads E24 and E04 were reported to inhibit the viral replication in Vero E6 cells infected by SARS-CoV-2, and further confirmed to hamper the replication of SARS-CoV-2 in lung epithelial cells and pluripotent stem cell-derived 3D lung organoids of human (Huff et al., 2022).Another group has reported that the synthetic compounds 3w and 3x have demonstrated excellent inhibition of SARS-CoV-2 M pro with the values of Ki ¼ 0.0141 lM and Ki ¼ 0.0332 lM, respectively.Crystal structural analysis of SARS-CoV-2 M pro with these synthetic compounds had covalent binding with the catalytically important residue Cys145.(Pillaiyar et al., 2022).Similarly, several studies have shown the inhibitory effect of synthetic compounds' anti-viral activity against M pro of SARS-CoV-2 in computational, cellline, and mouse model experiments.

Molecular docking
The crystal structure of the main protease from SARS-Coronavirus 2 was retrieved from Protein Data Bank (PDB) with PDB ID: 6LU7.The protein structure was prepared using a multi-step process in the protein preparation wizard of Schr€ odinger (Protein Preparation Wizard, 2020).The preparation process includes the correction of bond orders and partial charges, optimization of protein structure by using OPLS-3e force field, the addition of missing residues and hydrogen bonds, and energy minimization of the structure up to the maximum convergence of the threshold RMSD 0.30 Å.The Glide Grid generation panel in Schr€ odinger was used to produce the receptor grid panel to facilitate the molecular docking, which reduces the time spent for docking trails at non-specific sites.Prior to molecular docking, the ligand molecules were suitably prepared using the LigPrep module in Schr€ odinger.The molecules in two-dimensional conformations were converted to three dimensional conformers and the tautomeric states of the ligands were generated at the pH of 7.0 ± 2.0.The OPLS-3e force field was used to generate various conformations of each ligand molecule up to a maximum of 32 conformers.The glide module of Schr€ odinger was used to dock the prepared ligand molecules in the defined grid of the receptor protein.Glide XP docking (extra precision mode) was employed in order to ascertain the accurate prediction of the binding affinity (Prabhu et al., 2020).

Molecular dynamics simulation
To understand the complex stability and affinity between the synthesized ligand molecules (4a and 5a) and receptor protein M pro (PDB ID: 6LU7), Molecular Dynamics (MD) simulation was carried out for 100 ns using the GROMACS v2019.3package (Groingen Machine for Chemical Simulations) using GROMOS96 43a1 force field (Hess et al., 2008;Oostenbrink et al., 2005).PRODRG web-server was used to build the atomic charges and GROMACS topology files for the ligand (van Aalten et al., 1996).A simple point charge water model was used in the cubic dimensions of 10 nm 3 in all directions.The solvated system was neutralized by replacing water molecules with appropriate Na þ or Cl À ions.Energy minimization was performed using the Steepest Descent algorithm to minimize the energy of the system.Electrostatic interactions were computed by applying the Particle-Mesh Ewald (PME) method and the LINCS algorithm was implied to constrain the covalent bonds (Hess et al., 1997).The NPT and NVT (Number of particles, Pressure/Volume, and Temperature) canonical ensembles were computed.The Berendsen thermostat was used to increase the temperature of the system (300 K) and the pressure was maintained at 1 bar (Bussi et al., 2009).The MD simulation trajectory files were analyzed for RMSD, Root Mean Square Fluctuation (RMSF), Radius of gyration (Rg), and H-bond contacts using scripts.The trajectories were graphically represented using Origin Pro tool.

Binding free energy calculations
The binding free energies between the synthesized ligand molecules (4a and 5a) and SARS-CoV-2 M pro protein were computed using Schr€ odinger Prime MM-GBSA (Prime, 2019).Prime MM-GBSA uses the combined molecular mechanics energies, SGB solvation model for polar solvation, nonpolar solvation, and OPLS for all atoms to compute the energy profiles.The binding free energy (DG bind ) of the ligandbound complex was computed using the expression provided below: The GBSA solvation energy difference between M Pro -the ligand molecules and as well the sum of salvation energies of the M Pro and the ligand molecules are represented as DG solv .The minimized energy lies between the M Pro and the ligand molecules are represented as DE MM .The difference in the surface energies of the M Pro -the ligand complexes and the summation of M Pro and the ligand molecules' surface energies.MM-GBSA analysis reveals the stabilizing factor and energy parameters of the protein-ligand complexes.

Density functional theory calculation
The Jaguar module of Schr€ odinger was used to perform density functional theory calculations (Jaguar, 2017), ( Rajamanikandan & Srinivasan, 2016).For ligand optimization, a basic set of 6 31 G � þ level, the hybrid density functional theory with exchange potential (Becke's three parameters exchange potential), and the Lee-Yang-Parr correlation functional (B3LYP) gradient were used.The compounds' energy was minimized using the Poission Boltz-mann Finate (PBF) implicit solvation method (Lee et al., 1988;Samurkas et al., 2020).The values of the Highest Occupied Molecular Orbit (HOMO), Lowest Unoccupied Molecular Orbit (LUMO), and energy gap were calculated based on the electronic features of the compounds.

Results and discussion
The choice of an appropriate reaction condition is one of the important keys to successful organic synthesis.The synthetic route was based on a multicomponent reaction strategy involving 1,3-dipolar cycloaddition reaction between maleimide derivatives as dipolarophile and azomethine ylide generated in situ from ninhydrin and benzylamine as depicted in Scheme 1.The cycloaddition reactions were achieved by refluxing equimolar quantities of reactants ninhydrin (1 mmol), benzylamine (1 mmol) and maleimide derivatives (1 mmol) in methanol.Upon completion, the reaction was monitored by TLC, the solvent was evaporated and the remaining precipitate was washed with water to obtain the crude product in good yields (90-92%) (Table 1).This reaction progressed in a selective regio-and stereo controlled type, as only one diastereoisomer, was obtained.
A plausible mechanism for the formation of pyrrolo[3,2c]pyrroles was depicted in Scheme 3. In the proposed mechanism, it is predicted that condensation has taken place between ninhydrin and benzylamine/sarcosine, followed by decarboxylation to form the azomethine ylide.In the next step, the azomethine ylide undergoes in situ [3 þ 2] cycloaddition reaction with maleimide to yield cycloadduct pyrrolo[3,2-c]pyrroles in a concerted manner.

Molecular docking studies
Forthcoming studies reveal that the SARS-CoV-2 virus has a comparable pattern with early existing coronaviruses.Mainly the SARS-CoV-2 virus had two main proteases namely main protease (M pro ) and papain-like protease (PL pro ) to complete its life cycle (Elmezayen et al., 2021;Elseginy et al., 2021) These proteases are playing a vital role in the replication process of the SARS-CoV-2 virus; this point reveals that the targeting of M pro and PL pro proteases is very important in the prediction of anti-SARS-CoV-2 therapeutics.Especially, the main protease (M pro ) �96% structure similarity with the already existing SARS-CoV virus, so docking was carried out against M pro is mainly targeted in this article.The newly released crystal structure of SARS-CoV-2 M pro protein (PDB ID: 6LU7) was selected for the docking studies of synthesized compounds 4a and 5a (Figure 4).Docking analysis of 4a inside the binding site of 6LU7 showed the glide score of À 7.82 kcal mol À 1 and glide energy of À 43.88 kcal mol À 1 (Table 3).Binding mode analysis revealed four hydrogen bond interactions with the amino acid residues such as Asn142, Gln189, Glu166 and His164 as shown in   1.93 Å) and His164 (OH … O ¼ C, bond length ¼ 2.07 Å).Further, the amine group of 4a showed hydrogen bond interaction with the carbonyl group of Glu166 with a bond distance of 2.28 Å.In the case of 5a-6LU7 complex, the glide score and glide energy were found to be À 7.02 kcal mol À 1 and À 28.71 kcal mol À 1 (Table 3).Binding mode analysis of 5a-6LU7 showed three hydrogen bond interactions with the amino acid residues such as Glu166 (NH … O ¼ C, bond length ¼ 2.22 Å), Glu160 (OH … HN, bond length ¼ 2.33 Å), His164 (OH … O ¼ C, bond length ¼ 1.65 Å).The binding mode interaction analysis of 4a and 5a with the active site residues of 6LU7 was shown in Figure 3. Based on the docking results, it was observed that both molecules showed a similar kind of binding orientation and the amino acid residues such as Asn142, Gln189, Glu166, His164, and Glu160 plays a crucial role in the stabilization of the ligand molecules inside the binding site of 6LU7.

ADME prediction result
The pharmacokinetic analysis of our synthesized compounds 4a and 5a were calculated by using SwissADME online tool.Investigation of absorption, distribution, metabolism, and excretion properties play a major role in drug discovery and development and the results were summarized in Table 4. Compounds 4a and 5a showed Log P values less than five, revealing good cell permeability.The percentage of absorption (% ABS) ranges from 77 to 80 showing good bioavailability.Similarly, the TPSA value of 92.34 and 83.55 denotes good solubility.The PAINS and Brenk filter showed that the 1, and 2 violations reveal that the compounds 4a and 5a have good pharmacokinetic properties.Compound 5a had one violation of the Ghose rule, otherwise, the compounds should obey all the other rules and they have a better value in their solubility parameter.
The result, clearly explains the compound has a better effect on the Ion-channel modulator, Kinase inhibitor, nuclear receptor ligand, G-protein coupled receptor, protease, and enzyme inhibitor.The compounds were found to be nontoxic in nature from the ProTox II computational prediction.

Molecular dynamics simulation
Understanding the structural stability of protein-ligand complexes was studied using MD simulation of 100 ns.RMSD, RMSF, and protein-ligand contacts are normally used to estimate the dynamic behavior and stability of the docked complexes in the timescale studied.The RMSD profile of the protein-ligand complexes is illustrated in Figure 5a.In the case of the 4a-6LU7 complex, the initial rise in the RMSD was observed up to 2.5 ns, mainly due to the absence of restraining the production phase of the MD simulation.Further, the protein showed many up and downs in the RMSD up to 40 ns, after that, the protein showed stable RMSD around 0.45 nm for the remaining MD simulation indicating the equilibration and stability.The average RMSD and SD values for the 4a-6LU7 complex were 0.43 nm and 0.06 nm.In the case of 5a-6LU7 complex, no maximum rise in the RMSD was noted as compared with 4a-6LU7.The RMSD curve showed minimal deviation throughout the MD simulation.The average RMSD with the SD values was observed to be 0.25 nm and 0.03 nm.The least RMSD data with the minimal deviation between the two complexes confirms the stability of the protein-ligand complexes in the time scale studied.To understand the flexibility of amino acid residues in the protein, c-alpha atoms were analyzed with respect to the reference structure during the simulation time (Figure 5b).None of the amino acid residues showed greater fluctuations beyond 0.8 nm, except one amino acid Ser1 in both the complexes showed the fluctuation rate of 0.56 nm and 0.80 nm, respectively.In complex 1 (4a-6LU7), the amino acid residues such as Ser1 (0.56 nm), Gly2 (0.36 nm), Ala194 (0.26 nm), Gly195 (0.28 nm), Thr196 (0.26 nm), Asn274 (0.26 nm), Gly275 (0.28 nm), Asn277 (0.34 nm), Gly278 (0.37 nm), Arg279 (0.28 nm), Ser301 (0.27 nm), Phe305 (0.30 nm) showed higher residual fluctuations.Interacting amino acids in 4a such as Asn142 (0.21 nm), Gln189 (0.17 nm), Gu166 (0.12 nm), His164 (0.08 nm) showed minimal fluctuation over the MD simulation.In the 5a-6LU7 complex, the major fluctuation was observed in the amino acid residues which include Ser1 (0.80 nm), Gly2 (0.58 nm), Phe3 (0.40 nm), Glu47 (0.27 nm), Ser139 (0.27 nm), Leu141 (0.31 nm), Asn142 (0.32 nm), Met276 (0.28 nm), Asn277 (0.29 nm), Thr304 (0.34 nm) and Phe305 (0.41 nm).Interacting amino acid residues such as Glu166, Glu160 and His164 showed very minimal fluctuations of 0.11 nm, 0.09 nm and 0.09 nm, respectively.The small fluctuations observed in the interacting residues denotes the stable binding of ligands with 6LU7.The compactness of protein-ligand complexes was estimated through Rg (Figure 5c).The 5a-6LU7 complex showed wider fluctuation upto 80 ns and then reached stability at the end of the simulation.But this behavior was not noted in 4a-6LU7 complex, where the complex exhibited minimal fluctuation upto 26 ns, after that, it showed stable fluctuation over the simulation time.The average Rg and SD values for 4a-6LU7 and 5a-6LU7 were observed to be 2.10 nm, 2.13 nm, and 0.02 nm, 0.02 nm respectively.The least Rg value reflects the compactness of the protein-ligand system.Estimating the hydrogen bonds in the protein-ligand complex plays a major role in understanding the structure, function, enzyme inhibition, and conformation behavior of the macromolecules.Time-dependent hydrogen bonds were estimated for the protein-ligand complexes during the simulation time (Figure 5d).
Hydrogen bond analysis reveals that a maximum and average of 6 and 1 hydrogen bond interactions were observed for the 4a-6LU7 and 5a-6LU7 complexes.All these observed hydrogen bonds play a major role in the stability of the protein-ligand complexes.In addition, the stability of the complex was estimated through Prime MM-GBSA calculation.The binding free energy calculated for 4a-6LU7 at different trajectories ranged from À 33.91 kcal/mol to À 17.24 kcal/mol, whereas in the case of 5a-6LU7 it ranged from À 48.51 kcal/mol to À 37.32 kcal/mol, respectively.From the results, it is clear that the free energy of binding values is higher for 5a compared to 4a.In addition, coulomb energy, van der Waals, covalent energy, and solvation energy are considered to be the driving forces of ligand binding inside the binding site of 6LU7 (Table 5).

Density functional theory calculation
Figure 6 represents the density functional state of compounds 4a and 5a.The ionization potential and electron affinities of the small molecules are usually expressed through HOMO and LUMO orbital energies.In which HOMO and LUMO function as electron donor and acceptor, respectively (Al-Sabagh et al., 2013).The energy gap between (EHOMO-ELUMO) determines the reactivity and chemical stability of the molecule.HOMO and LUMO energy profile of the compounds ranges between À 0.208 eV, À 0.184 eV and À 0.247 eV, À 0.186 eV, respectively indicates the fragile nature of the bound electron.EHOMO-ELUMO energy gap varies between 0.02 eV (4a) and 0.06 eV (5a).The low energy gap indicates the polarizability, reactivity and stability of the compounds.In 4a, the HOMO and LUMO orbitals are located on the ninhydrin core.In case of 5a HOMO map is located in the same ninhydrin core explains the electron donating and withdrawing groups in the compounds.

Conclusion
This report details an attempt to identify potential inhibitors of SARS-CoV-2 M pro for the treatment of SARS and other coronavirus diseases.The synthesized pyrrolo [3,2-c] pyrroles were subjected to in silico molecular docking with SARS-CoV-2 M pro (6LU7) and the results showed good binding affinity towards SARS-CoV-2 protein with the binding energy of À 7.82 kcal/mol and À 7.02 kcal/mol respectively.The binding affinity between M pro and synthesized compounds was verified using MD simulation prime MM-GB-SA calculation.The important functional group responsible for the bioactivity of the compounds was assessed using DFT prediction.ADME studies revealed that all compounds are likely to be orally active as they obey Lipinski's rule of five and show good bioavailability.The present work is a successful attempt to identify novel anti-SARS-CoV-2 M pro inhibitors.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
The author acknowledges the management board and Department of Chemistry, Karpagam Academy of Higher Education, Coimbatore, Tamil Nadu, India for financial support for this work.The author also acknowledges the Sophisticated Analytical Instrumental Facility (SAIF), Cochin University of Science and Technology (CUSAT), Kochi for providing NMR facility.

Figure 2 .
Figure 2. 1 H and 13 C NMR spectral characterization for the products 4c and 5b.

Figure 3 .
Figure 3. XRD structure of compound 5c, the hydrogen atoms in the phenyl groups are omitted for clarity.
a Contribution to the MMGBSA free energy of binding from the coulomb energy.b Contribution to the MMGBSA free energy of binding from the van der Waals energy.c Contribution to the MMGBSA free energy of binding from covalent binding.d Contribution to the MMGBSA free energy of binding from lipophilic binding.e Free energy of binding.

Table 3 .
Docking results of the synthesized compounds 4a and 5a.

Table 4 .
ADME prediction of 4a and 5a through SwissADME online tool.

Table 5 .
Binding free energy calculation (kcal/mol) of 6LU7 with 4a and 5a for different trajectories.S.No Trajectories DG coulomb