Cytidine Derivatives as SARS-CoV-2 Mpro Inhibitors: Antiviral Prediction, Molecular Docking and Pharmacokinetic Score Investigations

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a beta coronavirus that was first found during the Wuhan COVID-19 epidemic in 2019 and is listed as a potential global health threat by the World Health Organization due to its high mortality. The main protease of SARS-CoV-2 is one of the optimum targets for antiviral drug design and development. Nucleoside derivatives have been investigated for many years, and some of the most clinically effective antiviral agents currently in use are purine or pyrimidine nucleoside derivatives. In this study, the hydroxyl (–OH) groups of cytidine were modified with different aliphatic and aromatic groups to obtain 5´-O-acyl and 2´,3´-di-O-acyl derivatives, and these derivatives were then subjected to molecular modeling, molecular docking, antiviral prediction, and pharmacological studies. Density functional theory at the B3LYP/3-21G level was employed to analyze the thermochemical stability and molecular electrostatic potential of the modified derivatives and to evaluate the effect of the aliphatic and aromatic groups on the drug properties. All the derivatives were more stable than their parent molecule, cytidine. The experimental and computed IR analyses showed the characteristic peaks for the various aliphatic and aromatic groups. The antiviral parameters of the modified derivatives revealed promising drug properties compared with those of standard antiviral drugs. Molecular docking was performed using AutoDock Vina to determine the binding affinities and interactions between the cytidine derivatives and the SARS-CoV-2 main protease. The modified derivatives strongly interacted with the key Cys145 and His41 residues. Finally, the pharmacokinetic characterization of the optimized inhibitors indicated that the derivatives are safe owing to their improved kinetic properties. Our comprehensive computational and statistical analyses showed that the selected cytidine derivatives can be used as potential inhibitors against SARS-CoV-2.
The predicted antiviral activities revealed that the modified cytidine derivatives (2–15) exhibit potential antiviral efficacy compared with their parent molecule. The aliphatic derivatives (2–4) and aromatic derivatives (8, 11, 14, and 15) exhibited more promising scores than aliphatic derivatives (3–5) along with the standard drugs remdesivir and azidothymidine (AZT).


Introduction
Nucleoside agents (NAs) are the subunits of DNA and RNA and comprise a sugar moiety connected to a nitrogen base through an N-β-glycosidic bond [1]. NAs have considerable clinical importance as medicinal agents due to their antiviral and anticancer activities [2] and are the drugs of choice for treating various viral diseases, such as herpes simplex (HSV-1), human cytomegalovirus, varicella-zoster, human immunodeficiency virus (HIV) type-1, human hepatitis B (HBV) and C (HCV) [3], Ebola [4], dengue [5], and Zika [6]. Additionally, 2´-deoxynucleosides such as idoxuridine, trifluridine, edoxudine, vidarabine, and brivudine are used to treat herpes virus infections [7,8]. Certain 2´,3´-dideoxynucleosides such as zidovudine, didanosine, zalcitabine, stavudine, and abacavir are the most effective therapeutic agents against HIV [9]. Modifications in the sugar moieties of nucleosides, such as ribofuranose or deoxyribofuranose, include changes in sugar substituents, the replacement of oxygen with another atom, the addition of a heteroatom in the sugar ring, ring size variations, and replacement with an acyclic moiety [10][11][12][13][14][15]. These alterations may lead to marked variations in the biological activity and degree of selective toxicity according to the respective chemical and physical properties of the moieties [16][17][18][19][20][21][22]. The modified compounds exhibit broad-spectrum biological activity. For example, zidovudine, with an azido group at the 3´-position, is used to treat HIV. Thymidine derivatives such as telbivudine are antiviral drugs used in HBV treatment [23]. Azidothymidine (3´-azido-2´,3´-dideoxythymidine) is another thymidine analog used in HIV treatment. The supplementation of dietary cytidine (5´)-diphosphocholine protects against the development of memory deficits [24]. Cytidine is present in organ meats and pyrimidine-rich foods such as beer, tomatoes, broccoli, and oats.
Cytidine is a component of RNA, which transfers instructions from DNA to proteins [25]. When RNA levels decrease, cytidine is supplemented to maintain high RNA levels for a high memory function. Another important function of cytidine is to increase the production and release of dopamine in the brain. Dopamine is a powerful neurotransmitter responsible for regulating functions such as mood and movement control.
Nucleoside analogs and nucleobases constitute a pharmacologically diverse family, which includes cytotoxic compounds, antiviral agents, and immunosuppressive molecules [26][27][28][29]. The cytidine analog 5-AZA-2´-deoxycytidine is utilized to control the growth of malignant neuroblastoma tumors [30]. The cytidine analog KP-1461 is an anti-HIV agent that acts as a viral mutagen [31]. Various cytidine derivatives modified at the base or ribose exhibit antiviral or antitumor activities.
The recent outbreak of the novel coronavirus disease, caused by a severe acute respiratory syndrome (SARS)-like coronavirus, that started in Wuhan, China, is spreading rapidly in humans; this outbreak is now considered a global pandemic [32]. Modifications of the hydroxyl (-OH) group of the nucleoside structure have yielded some potent SARS-CoV-2 candidates [33][34] and antimicrobial agents. The COVID-19 outbreak remains a serious problem worldwide. Although SARS-CoV and SARS-CoV-2 both belong to the beta-coronavirus category, they differ slightly from each other. Studies have shown that SARS-CoV-2 shares 80% nucleotide identity and 89.10% nucleotide similarity with SARS-CoV. Thus, the main protease of SARS-CoV, 3CLpro, is the target of several in silico investigations for developing potential inhibitor candidates. The 3CLpro enzymes of the two viruses show a high sequence identity; hence, they are likely homologous and have similar structures and functions. Furthermore, SARS-CoV and SARS-CoV-2 affect cells in the same manner and employ the same protein machinery to replicate inside the host cell. Given these features, we explored the molecular electrostatic potential (MEP) and biochemical behavior of several previously synthesized cytidine derivatives by conducting a quantum mechanical study. The infrared (IR) spectrum was computed through optimization and frequency calculation and compared with the experimental IR spectra, which confirmed the insertion of various functional groups. Furthermore, all the derivatives were docked against the SARS-CoV-2 main protease protein (PDB: 6LU7) to understand their nonbonding interactions, binding mode, and binding affinity and to predict their antiviral properties. Pharmacokinetic properties were investigated to compare their absorption, lipophilicity, and solubility, and a radar map was utilized to understand their biological acceptance.

Materials and Methods
To identify drug interactions with receptor proteins, molecular docking is the optimum tool. In the blind docking method, the overall surface of the protein molecule was thoroughly analyzed for binding sites. The following software tools were used in this study: i) Gaussian 09, ii) AutoDock 4.2.6, iii) Swiss-PdbViewer 4.1.0, iv) Python 3.8.2, v) Discovery Studio 4.1, vi) PyMOL 2.3, and vii) the AVCpred web server (http://crdd.osdd.net/servers/avcpred), which was used to predict antiviral properties. Moreover, the admetSAR server (http://lmmd.ecust.edu.cn/admetsar2/about) and the SwissADME free web tools (http://www.swissadme.ch) were employed to calculate the pharmacokinetic properties.

Antiviral Activity Determination
Antiviral molecules (AVMs) are a category of antimicrobial drugs used to treat viral infections by inhibiting the growth of viral pathogens inside the host cells. For antiviral activity calculation, we used the online software at http://crdd.osdd.net/servers/avcpred, which reports the predicted inhibition percentage. The cytidine derivatives were entered in the SD (structure-data) file format as input for the predictions. The derivatives were assessed for the development of antiviral therapeutics and to suggest the optimal inhibitory cytidine derivatives for further studies.

Chemical Reactivity and IR Optimization
In computer-based drug design, thermal, molecular orbital, and molecular electrostatic features are widely calculated using quantum mechanical methods [35]. The geometry optimization and subsequent modification of all the cytidine derivatives were conducted using the Gaussian 09 program [36]. The structures of the cytidine derivatives were optimized and their IR frequencies and spectral properties were calculated by employing density functional theory (DFT) with Becke's (B) three-parameter hybrid model and Lee, Yang, and Parr's (LYP) correlation functional, using the 3-21G basis set [37, 38].
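As an illustration of the optimization and frequency step described above, the following minimal Python sketch writes a Gaussian input file with a B3LYP/3-21G route line. The function name, the checkpoint file name, and the water coordinates are hypothetical placeholders (the actual derivative geometries are not given in the text); only the level of theory is taken from this section.

```python
def gaussian_opt_freq_input(title, atoms, charge=0, multiplicity=1,
                            chk="molecule.chk"):
    """Build the text of a Gaussian input file for an Opt + Freq job.

    atoms is a list of (element, x, y, z) tuples in angstroms.
    The route line requests B3LYP/3-21G, the level named in the text.
    """
    lines = [
        f"%chk={chk}",               # checkpoint file (hypothetical name)
        "#p B3LYP/3-21G Opt Freq",   # optimize, then compute IR frequencies
        "",                          # blank line ends the route section
        title,
        "",                          # blank line ends the title section
        f"{charge} {multiplicity}",  # net charge and spin multiplicity
    ]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")                 # blank line terminates the geometry
    return "\n".join(lines) + "\n"


# Placeholder geometry (water), NOT a cytidine derivative.
water = [
    ("O", 0.000000, 0.000000, 0.117000),
    ("H", 0.000000, 0.757000, -0.469000),
    ("H", 0.000000, -0.757000, -0.469000),
]
input_text = gaussian_opt_freq_input("placeholder molecule", water)
print(input_text)
```

In practice, the coordinate block would be replaced with the starting geometry of each derivative, and the resulting file would be submitted to Gaussian 09.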

Pharmacokinetic Prediction
In drug development, ADMET (absorption, distribution, metabolism, excretion, and toxicity) property prediction is crucial to prevent drug failure in clinical stages. Thus, the designed derivatives were evaluated for their in silico pharmacokinetic parameters to reduce the risk of failure during clinical trials and improve their candidacy as potential drugs. The online server admetSAR was employed to evaluate the pharmacokinetic profile, including the lipophilicity, toxicity, and absorption, of cytidine and its selected analogs [39]. By using a structural similarity search methodology, admetSAR provides the latest and most comprehensive manually curated data for diverse chemicals associated with known ADMET profiles. The admetSAR database contains 96,000 unique compounds (covering 45 types of ADMET-related parameters), proteins, species, and organisms diligently curated from various studies.

Protein Selection and Molecular Docking
The 3D crystal structure of the SARS-CoV-2 main protease protein (PDB: 6LU7) was retrieved in PDB format from the Protein Data Bank [40]. The PyMol (version 1.3) software package was used to remove all the heteroatoms and water molecules [41]. The protein energy was minimized using Swiss-PdbViewer (version 4.1.0) [42]. Subsequently, a molecular docking study was conducted for the optimized drugs against the SARS-CoV-2 main protease protein 6LU7 (Fig. 1). The PyRx application (version 0.8) was used for the molecular docking simulations [43], with the target protein and the cytidine derivatives treated as the macromolecule and ligands, respectively. The protein and ligands were prepared as input by converting the PDB format to the pdbqt format by using the AutoDock tools of the MGL software package. In AutoDock Vina, the grid box size was maintained at 51.3565, 66.9335, and 59.6050 Å along the X-, Y-, and Z-axes, respectively. After docking, both the macromolecule and ligand structures were saved in the pdbqt format, and Accelrys Discovery Studio (version 4.1) was employed to analyze the docking results and predict the nonbonding interactions between the cytidine derivatives and the amino acid chains of the receptor protein [44]. The PROCHECK online server was used for validation; an overall quality factor of 92.06 was obtained in ERRAT (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8401235&dopt=Abstract), and a score of 93.10% was acquired in VERIFY 3D (https://www.ncbi.nlm.nih.gov/pubmed/1853201?dopt=Abstract). The PDBsum online server was used to validate the main protease receptor with the Ramachandran plot and ligplot (Fig. 2), which indicated 90.6% of the residues in the allowed region, with no missing residues.
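The docking setup described above can be sketched as a small Python helper that writes an AutoDock Vina configuration file. The grid box dimensions are those reported in the text; the grid center, the file names, and the exhaustiveness value are assumptions introduced only for illustration, since the text does not state them.

```python
def vina_config(receptor, ligand, center, size,
                out="docked.pdbqt", exhaustiveness=8):
    """Return the text of an AutoDock Vina configuration file.

    center and size are (x, y, z) tuples in angstroms.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    entries = [
        ("receptor", receptor),
        ("ligand", ligand),
        ("center_x", f"{cx:.4f}"),
        ("center_y", f"{cy:.4f}"),
        ("center_z", f"{cz:.4f}"),
        ("size_x", f"{sx:.4f}"),
        ("size_y", f"{sy:.4f}"),
        ("size_z", f"{sz:.4f}"),
        ("out", out),
        ("exhaustiveness", str(exhaustiveness)),
    ]
    return "\n".join(f"{key} = {value}" for key, value in entries) + "\n"


# Grid box size reported in the text (angstroms along X, Y, Z).
GRID_SIZE = (51.3565, 66.9335, 59.6050)
# Hypothetical grid center: the text does not report it.
GRID_CENTER = (0.0, 0.0, 0.0)

config_text = vina_config(
    receptor="6LU7_prepared.pdbqt",   # hypothetical file name
    ligand="derivative_14.pdbqt",     # hypothetical file name
    center=GRID_CENTER,
    size=GRID_SIZE,
)
print(config_text)
```

A box of this size covers a large portion of the protein, which matches the blind docking approach described in the Materials and Methods.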

Results And Discussion
In this study, 14 cytidine esters (2-15) modified with different aliphatic and aromatic chains (Table 1) were geometrically optimized to understand the modes of their antimicrobial behavior. Initially, partially acylated derivatives were selected for antiviral activity prediction using the online web tool. Subsequently, the observed activities were rationalized by examining the IR frequencies, physicochemical properties, molecular docking, in silico pharmacokinetics, and drug-likeness properties. In nucleoside chemistry, the selective alteration of certain hydroxyl groups is important because the resulting acylation products might be useful precursors for the synthesis of new bioactive products. Moreover, the designed acyl derivatives might exhibit a high antiviral efficacy and serve as versatile intermediates for synthesizing various other antiviral drugs of fundamental importance. Table 1 and Fig. 3 present the atomic identification and structural variations of the substituted cytidine derivatives. Different aliphatic (pivaloyl, hexanoyl, octanoyl, decanoyl, lauroyl, palmitoyl, myristoyl, and stearoyl) and aromatic (4-chlorobenzoyl, cinnamoyl, 4-tert-butylbenzoyl, and trityl) groups were used to modify the hydroxyl (-OH) groups of cytidine to investigate the variations in biological activities.

Computed and Experimental IR Spectrum for Characterization
The IR spectrum, which indicated the characteristic peaks for various functional groups, was calculated through optimization and frequency calculation by using the Gaussian 09 software package. In the modified cytidine derivatives, various aliphatic (pivaloyl, hexanoyl, octanoyl, decanoyl, lauroyl, palmitoyl, myristoyl, and stearoyl) and aromatic (4-chlorobenzoyl, cinnamoyl, 4-tert-butylbenzoyl, and trityl) groups were introduced. For functional groups such as CH3, CO, and NH3, the C-C, C-H, C-N, C=O, and N-H stretching vibrations were observed (Fig. 4).
The experimental IR spectra of the cytidine derivatives (Fig. 5) displayed peaks at almost the same frequencies as the computed ones for all the functional groups. The IR spectra of derivatives (2 and 8) show the following absorption bands: 1731 and 1714 cm−1 (due to -CO stretching), 3420 and 3416 cm−1 (due to -OH stretching), and 3550 cm−1 (due to -NH stretching). Furthermore, for derivatives (3-5) and (9-13), modified with different aliphatic chains (C5-C18), the IR spectra displayed absorption bands at 1729 and 1716 cm−1 for C=O stretching and 3470 cm−1 for -NH stretching. Because no -OH group was present in these derivatives, the peak for OH was absent in their spectra. Moreover, derivatives (6, 7, 14, and 15), which comprised aromatic substituents, displayed absorption bands at 1726 and 3470 cm−1 corresponding to C=O and -NH stretching vibrations, respectively. Both the experimental and predicted IR analyses confirmed the insertion of different aliphatic and aromatic substituents in the cytidine structure.

MEP
In computer-aided drug design, atomic charges are employed to investigate the connectivity between the structure and biological activity of drugs. MEP is globally used as a reactivity map displaying the most suitable regions for the electrophilic and nucleophilic attacks of charged-point-like reagents on organic molecules [46]. MEP helps interpret the biological recognition process and hydrogen bonding interactions [47]. The contour map of MEP provides a simple approach to predict how different geometries can interact. The MEP of the title compounds was obtained from the geometries optimized at the B3LYP/3-21G level (Fig. 6). MEP is important because it simultaneously displays the molecular size and shape and the positive, negative, and neutral electrostatic potential regions through color grading and is useful for studying the relationship between molecular structure and physicochemical properties [48]. MEP was calculated to determine the reactive sites for electrophilic and nucleophilic attacks on the optimized structures of cytidine derivatives (7, 8, and 10). The red, blue, and green colors represent the maximum negative area favorable for electrophilic attacks, the maximum positive area favorable for nucleophilic attacks, and the zero potential areas, respectively.

Molecular Docking Simulation
In structural biology and computer-aided drug design, molecular docking is an important computational technique. The key aim of molecular docking is to determine the potential binding geometries of a putative ligand of a known 3D structure with a target protein. In this study, several cytidine derivatives were studied in silico to determine their possible binding energies and interaction modes with the active sites of SARS-CoV-2 Mpro (Table 4) by using AutoDock Vina software. Table 3 presents the estimated binding energies at the binding site of the 6LU7 enzyme structure (Fig. 7) for all the studied compounds. According to the docking screening results, eight derivatives (6-10 and 13-15) with the strongest binding energies were selected to describe the binding mode of the cytidine inhibitors. Comparatively, the aromatic derivatives exhibited better binding scores than the aliphatic derivatives. Figure 8 illustrates the interactions between the inhibitor and the bordering residues of SARS-CoV-2 Mpro in 2D schematics acquired by importing the docking results into the Discovery Studio Visualizer. These interactions showed that the amino acids participating in interactions between the ligand and enzyme made an important contribution to the total interaction energy. Most interactions included hydrophobic contacts, Van der Waals interactions, hydrogen bonding, electrostatic interactions, carbonyl interactions, and specific atom-aromatic ring interactions and provided insights into molecular recognition. Figure 9 presents the docked conformation of the most active molecules (8 and 14) based on the docking studies.
The results showed derivative (14) as the most promising ligand (−9.2 kcal/mol) that bound with SARS-CoV-2 Mpro through hydrophobic bonding and many hydrogen interactions. The binding site is located in the hydrophobic cleft bordered by amino acid residues HIS41, ILE249, PHE294, VAL104, CYS145, HIS246, and VAL297. Four hydrogen bond contacts, at distances of 2.526, 2.814, 2.417, and 2.282 Å, involve ASN151, ILE152, and GLN110. Compound (14) contains an additional benzene ring on cytidine, providing a high density of electrons in the molecule and the highest binding score. These results indicated that modification of the -OH group with long carbon chains or aromatic rings led to an increase in the binding affinity, and the addition of hetero groups such as Br caused some fluctuations in the binding affinities; however, modification with halogenated aromatic rings led to an increase in the binding affinity. The docked pose showed that the drug molecules bind within the active site of the SARS-CoV-2 Mpro macromolecular structure. The parent molecule cytidine (1) interacted with the key residues of the main protease, CYS145 and HIS163, through hydrogen bonding within a close bond distance (2.173 Å). Additionally, GLY143, SER144, and LEU141 interactions were observed, and the interaction with SER144 showed a short bond distance (2.277 Å) due to the unique interaction of the branched alkyl chain with the cytosine base. Acyl-chain-substituted derivatives (3-5) and (11 and 12) revealed low binding scores with the main protease, indicating the burying of the ligand in the receptor cavity. Although these derivatives exhibited low binding affinity, they interacted with the catalytic binding residues of the main protease, such as TYR154, HIS41, HIS163, HIS164, PHE294, GLN110, ASN238, GLU166, SER158, ILE152, THR199, and GLY143.
These derivatives exhibited diverse nonbonding interactions, such as pi-anion, pi-donor hydrogen bond, amide pi-stacked, pi-pi stacked, and pi-pi T-shaped interactions, with the active sites of the main protease.
The aromatic substituents led to an increase in the binding energies of the derivatives: −7.4, −7.4, −7.4, and −7.0 kcal/mol for derivatives 7-10 and −8, −6.2, and −9.2 kcal/mol for derivatives 13-15, respectively. These derivatives interacted with similar binding sites of the main protease, and PHE294, THR111, GLN110, ILE249, LEU287, and ASN151 were the most common residues for them. Among all the residues, GLN110 exhibited the minimum bond distances of 2.116 and 2.068 Å. These results revealed that, due to their high electron density, aromatic substituents can readily increase the binding and antiviral abilities of the cytidine derivatives. Along with PHE294, all the derivatives showed the maximum π-π interactions with ILE249, indicating strong binding with the active site. PHE294 is considered the principal component of PPS, APS, and PPT, which is responsible for the accessibility of small molecules to the active site. Binding energies and binding modes were improved for derivatives (7-10 and 13-15) due to significant hydrogen bonding. The alterations of the -OH group in cytidine enhanced the π-π interactions with the amino acid chain at the binding site, and the improved polarity resulted in hydrogen bond formation. The maximum number of H-bonds was observed in derivative (10), with ASN151, THR111, and GLN110 residues.
Ten commercial medicines possibly form H-bonds with the key residues of the 2019-nCoV main protease [49]. H-bonds play a vital role in shaping the specificity of ligand binding with receptors, drug design in chemical and biological processes, molecular recognition, and biological activity. Figure 10 presents the H-bond surface and hydrophobic surface of derivative (10) with both the proteins. The blind docking study of all the cytidine derivatives with the SARS-CoV-2 protease revealed that the molecules were generally surrounded by the aforementioned residues, which is similar to the arrangement in standard drugs. This finding suggested that these molecules may prevent the viral replication of SARS-CoV-2. Table 4 presents the bond distances for the ligands and the changes in the accessible area of the two important catalytic residues (Cys145 and His41) within the active site of the protease. The blind docking results revealed that all the molecules can act as potential agents for COVID treatment; however, the estimated free energies of binding indicated derivative (14), with the most negative binding energy of −9.2 kcal/mol, as the optimum possible SARS-CoV-2 inhibitor among all the studied derivatives. Most of the selected cytidine derivatives exhibited promising activities and may be used to develop effective antiviral drugs against SARS-CoV-2.
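To give a sense of scale for the docking scores reported above, a binding free energy can be converted into an estimated inhibition constant through the standard thermodynamic relation ΔG = RT ln Ki. The following minimal sketch is an illustration only and is not part of the original analysis: the temperature of 298.15 K is an assumption, and Vina scores are empirical estimates rather than measured free energies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_ki(delta_g_kcal, temperature_k=298.15):
    """Estimate an inhibition constant (mol/L) from a binding free energy.

    Uses delta_G = R * T * ln(Ki), so Ki = exp(delta_G / (R * T)).
    The default temperature is an assumption, not a value from the text.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Docking scores quoted in the text (kcal/mol).
best_score = -9.2      # most negative score reported (derivative 14)
typical_score = -7.4   # score reported for several aromatic derivatives

ki_best = estimated_ki(best_score)
ki_typical = estimated_ki(typical_score)
print(f"Ki at {best_score} kcal/mol: {ki_best:.2e} M")
print(f"Ki at {typical_score} kcal/mol: {ki_typical:.2e} M")
```

Under these assumptions, the score of −9.2 kcal/mol corresponds to a sub-micromolar estimate, and each additional −1.4 kcal/mol lowers the estimated Ki by roughly one order of magnitude.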

Biological explication
The inhibition capacity of cytidine-like nucleosides against SARS-CoV-2 Mpro has been investigated in vitro [33][34]. Because the SARS-CoV and SARS-CoV-2 viruses are highly similar, we investigated the in silico behavior of the cytidine derivatives toward SARS-CoV-2 Mpro. Selecting this protein as the target has led to considerable advances in antiviral treatment because it participates in the proteolytic processing of the polyproteins required for replication. Consequently, it plays a key role in the expression and replication of viral genes. Therefore, the inhibition of this enzyme hampers the replication of the viral genome and the multiplication of SARS-CoV-2. Nucleoside derivatives that can inhibit SARS-CoV 3CLpro may inhibit SARS-CoV-2 Mpro in the same manner due to their high sequence identity.

Pharmacokinetic Profile and Molecular Radar
To predict the pharmacokinetic properties, such as the solubility, lipophilicity, and toxicity of the compounds, we used the pkCSM ADMET descriptor algorithm protocol. Drug absorption depends on various factors, including membrane permeability (indicated by the colon cancer cell line Caco-2), intestinal absorption, skin permeability thresholds, and P-glycoprotein substrate or inhibitor status (Table 5). According to the predictions, all the presented derivatives exhibit high skin penetrability.
In the pkCSM predictive model, high Caco-2 permeability is translated into predicted log Papp values > 0.90 cm/s. The Caco-2 permeability (log Papp) of the cytidine derivatives ranges from −4.3 to −2.4 cm/s, that is, log Papp < 0.9 cm/s (Table 6); thus, these derivatives exhibit a low Caco-2 permeability. Molecular radar is a crucial QSAR factor exhibiting the molecular volume of compounds. Figure 11 illustrates the physicochemical radar of all the cytidine derivatives and reveals the promising QSAR features of the designed compounds. In the discovery of orally administered drugs, solubility is a major descriptor. High water solubility is useful to deliver active ingredients in a sufficient quantity with small volumes of pharmaceutical dosage. The water solubility values are presented as log (mol/l) (insoluble ≤ −10 < poorly soluble < −6 < moderately soluble < −4 < soluble < −2 < very soluble < 0 ≤ highly soluble). The tested compounds are soluble (Table 6).
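The thresholds quoted above can be expressed as two small classification helpers. This is a minimal sketch: the function names are hypothetical, and the handling of values that fall exactly on the −6, −4, and −2 boundaries is an assumption, because the scale in the text does not specify it.

```python
def classify_solubility(log_s):
    """Map a water solubility value, log (mol/l), to the class in the text.

    Scale: insoluble <= -10 < poorly soluble < -6 < moderately soluble
    < -4 < soluble < -2 < very soluble < 0 <= highly soluble.
    Values exactly at -6, -4, or -2 are assigned to the more soluble
    class (an assumption; the text leaves these boundaries open).
    """
    if log_s <= -10:
        return "insoluble"
    if log_s < -6:
        return "poorly soluble"
    if log_s < -4:
        return "moderately soluble"
    if log_s < -2:
        return "soluble"
    if log_s < 0:
        return "very soluble"
    return "highly soluble"


def caco2_permeability(log_papp, threshold=0.90):
    """Label Caco-2 permeability as high if log Papp exceeds the threshold."""
    return "high" if log_papp > threshold else "low"


# The reported log Papp range for the derivatives is -4.3 to -2.4.
for value in (-4.3, -2.4):
    print(value, caco2_permeability(value))

# Illustrative log S value (hypothetical, not taken from Table 6).
print(classify_solubility(-3.0))
```

Both ends of the reported log Papp range fall below the 0.90 threshold, consistent with the low Caco-2 permeability stated in the text.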

Conclusions
We conducted a computational study to identify new inhibitors of SARS-CoV-2; molecular docking was performed for a series of nucleoside (cytidine) derivatives as candidate anti-SARS-CoV-2 agents. All the cytidine derivatives were successfully analyzed in silico for their antiviral activity prediction, IR characterization, MEP calculation, molecular docking, and pharmacokinetic properties. The insertion of various aliphatic and aromatic groups in the cytidine structure can considerably improve their biological and antiviral activity modes. The experimental and computed IR peaks confirmed the presence of various aliphatic chains and aromatic groups in the cytidine structure. Antiviral prediction indicated that aliphatic (2-4) and aromatic (8, 11, 14, and 15) derivatives exhibit potential antiviral modes. These findings were rationalized through molecular docking, which suggested a high antiviral efficacy of the cytidine derivatives. Many derivatives showed strong binding energies and binding interactions with SARS-CoV-2 Mpro. Eight cytidine derivatives (6-10 and 13-15) exhibit a potent in silico ability to inhibit SARS-CoV-2. Pharmacokinetic prediction provided promising results for the in silico properties, revealing that all the modified compounds exhibit an improved pharmacokinetic profile. Future in vitro and in vivo studies should determine whether these compounds can be used as drugs to treat SARS-CoV-2.

Figure 1 Crystal structure of the SARS-CoV-2 main protease protein (PDB: 6LU7) with the binding pocket
Computed IR spectrum from Gaussian based on DFT-B3LYP/3-21G