Identification of Novel Small Molecule Inhibitors Against nsP2 Protease of CHIKV through a Molecular Modeling Approach

Chikungunya is a tropical viral disease spread by female Aedes mosquitoes infected with the chikungunya virus (CHIKV). Non-structural protein 2 (nsP2) plays a crucial role in the viral life cycle through its proteolytic activity and is therefore one of the most important drug targets. There is currently no definitive treatment available for the infection. In this molecular modeling-based study, a combination of de novo ligand design, molecular docking, and ADMET-based screening is employed to identify novel inhibitor molecules targeting the active site of the CHIKV nsP2 protease. A set of molecules has been shortlisted as potential inhibitors based on binding affinity and drug-likeness score. Further experimental validation is required to verify the potency of the proposed leads against CHIKV nsP2 protease activity.


Introduction
Chikungunya is a viral infectious disease spread by the bites of infected female Aedes mosquitoes. It produces fever, joint aches and swelling, muscle pain, nausea, headache, and rash. It was initially reported in southern Tanzania and has now been detected in over forty countries worldwide [1]-[3]. CHIKV is an RNA virus that belongs to the Alphavirus genus of the Togaviridae family. The disease is mostly transmitted by two mosquito species, Aedes aegypti and Aedes albopictus [4]. The word "chikungunya" originates from a phrase in the Makonde language that means "that which bends up," and refers to the hunched posture of people suffering from joint discomfort (arthralgia) [1].
The virus's genome is translated into nine proteins, five of which are structural (one capsid protein (C), two envelope glycoproteins (E1, E2), and two peptides, E3 and 6K) and four of which are non-structural (nsP1, nsP2, nsP3, and nsP4) [5], [6]. Among the structural proteins, the capsid protein initiates the viral life cycle within the host cell via autoproteolysis, whilst the envelope glycoproteins aid in viral entry [7]. nsP1 and nsP3 synthesize the negative-sense RNA strand, while nsP4 controls virion polymerization in the host cell. The multifunctional protein nsP2 is a member of the papain superfamily of cysteine proteases and is well recognized for its proteolytic activity and involvement in viral replication. Therefore, nsP2 is one of the most promising targets for developing drugs against the infection [8]. Several computational and experimental studies have searched for nsP2 inhibitors through different routes, such as molecular modeling, structure-based drug design, molecular docking, MD simulation, pharmacophore mapping, and biological evaluation [9]-[17]. Deep learning contributes to drug discovery in multiple ways, from searching for potential binding pockets on the protein surface [18]-[20] to generating novel molecules based on protein pocket information [21]-[24]. LiGANN uses deep learning-based generative adversarial networks (GANs) to generate ligands from protein structure information [21].
In this in silico study, the crystal structure of the CHIKV nsP2 protease was obtained from the Protein Data Bank [25] and, based on the active site information, novel lead compounds were generated using LiGANN, a de novo drug design tool based on a generative adversarial network (GAN) [21]. All of the suggested molecules were then docked to the nsP2 active site using PyRx, an AutoDock-embedded software package [26], [27]. The ADMET characteristics of the highest-scoring ligands were evaluated using the SwissADME web server [28]. Further studies are needed to investigate the efficacy of the suggested compounds in treating chikungunya virus infection.

Materials and Methods
Protein structure retrieval: The crystal structure of the chikungunya virus nsP2 protease was downloaded in .pdb format from the RCSB PDB website (PDB ID: 3TRK). The crystallographic water molecules and ions were removed, and the protein surface was visualized using PyMOL [29]. The catalytic dyad (CYS 1013 and HIS 1083) was highlighted.
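The water and ion removal step can be illustrated with a minimal Python sketch. It is not the procedure used in this study (the structure was prepared interactively), and the list of residue names treated as waters or ions is an assumption chosen for illustration, not an exhaustive one. The function name is hypothetical.

```python
# Residue names treated as water or simple ions.
# Assumption: an illustrative, incomplete list; extend it for the structure at hand.
WATER_AND_ION_RESNAMES = {"HOH", "WAT", "NA", "K", "CL", "MG", "CA", "ZN", "MN"}


def strip_waters_and_ions(pdb_text, drop=WATER_AND_ION_RESNAMES):
    """Return PDB text without HETATM records whose residue name is in `drop`."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            # PDB columns 18-20 (0-based slice 17:20) hold the residue name.
            res_name = line[17:20].strip()
            if res_name in drop:
                continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Only HETATM records are filtered, so protein ATOM records and all other record types pass through unchanged.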
de novo ligand design: The prepared protein pdb file was uploaded as input to the LiGANN web server (https://playmolecule.com/LiGANN/), with the number of ligand shape generations and the number of decodings per shape both set to 10. A grid box centered on the catalytic dyad (CYS 1013 and HIS 1083) of the protein was generated. The protein was then processed and converted into .pdbqt format using the PyRx "Make macromolecule" utility. A total of 88 molecules were generated by LiGANN, and the SMILES strings of the compounds were converted into .sdf file format using Open Babel [30]. All the downloaded compounds were energy minimized and converted into .pdbqt format using PyRx before docking. A grid box was created around the catalytic dyad (CYS 1013 and HIS 1083), and docking was performed using the Vina wizard utility of PyRx. The 2D depictions of the interactions in the protein-ligand complexes were created using Discovery Studio Visualizer [31].
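The idea of a grid box centered on the catalytic dyad can be sketched in Python as follows. This is an illustration only: in this study the box was set through the PyRx interface, and the box dimensions used there are not reported here. The function names, the 5 Å padding, and the choice of the bounding-box midpoint as the center are assumptions. The configuration keys (center_x, size_x, and so on) are the standard AutoDock Vina configuration options.

```python
def docking_box(pdb_text, residue_numbers, padding=5.0):
    """Return (center, size) of an axis-aligned box around the selected residues.

    Assumptions: residues are selected by sequence number only (chain ignored),
    and the box is the atoms' bounding box grown by `padding` on every side.
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # PDB columns 23-26 hold the residue sequence number.
        if int(line[22:26]) in residue_numbers:
            # Columns 31-38, 39-46, 47-54 hold x, y, z in angstroms.
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    if not coords:
        raise ValueError("no atoms found for the requested residues")
    lows = [min(c[i] for c in coords) for i in range(3)]
    highs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(lows, highs))
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(lows, highs))
    return center, size


def vina_config(center, size):
    """Format the box as AutoDock Vina configuration lines."""
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    return "\n".join(f"{k} = {v:.3f}" for k, v in zip(keys, center + size)) + "\n"
```

For the dyad described above, the call would be docking_box(pdb_text, {1013, 1083}), where pdb_text holds the contents of the prepared protein file.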
ADMET prediction: ADMET stands for Absorption, Distribution, Metabolism, Excretion, and Toxicity. These properties are central to drug metabolism and pharmacokinetics (DMPK) and are crucial in drug discovery. The ADMET and physicochemical properties, drug-likeness, and synthetic accessibility of the top-scoring leads were estimated using the SwissADME web server.
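As an illustration of what a drug-likeness filter checks, the following Python sketch counts violations of Lipinski's rule of five using its classic thresholds. It is not SwissADME's implementation, which computes its own descriptors and reports several filters. The function names are hypothetical, and the descriptor values must be supplied by the user.

```python
def lipinski_violations(mol_weight, logp, h_bond_donors, h_bond_acceptors):
    """Count violations of Lipinski's rule of five (classic thresholds)."""
    checks = [
        mol_weight > 500,       # molecular weight above 500 Da
        logp > 5,               # octanol-water partition coefficient above 5
        h_bond_donors > 5,      # more than 5 hydrogen-bond donors
        h_bond_acceptors > 10,  # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_rule_of_five(mol_weight, logp, h_bond_donors, h_bond_acceptors,
                        max_violations=1):
    """Treat a molecule as drug-like if it has at most `max_violations` violations."""
    count = lipinski_violations(mol_weight, logp, h_bond_donors, h_bond_acceptors)
    return count <= max_violations
```

Allowing one violation by default follows the common reading of the rule; a stricter screen can pass max_violations=0.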

Results and Discussions
Protein structure retrieval: The crystal structure of the chikungunya virus nsP2 protease (3TRK), downloaded in .pdb file format, consists of 324 amino acids and has a resolution of 2.40 Å. The three-dimensional structure is visualized in cartoon representation, with the catalytic dyad residues shown in licorice representation (Figure 1).
de novo ligand design: The prepared pdb file was given as input to the LiGANN server, which generated 88 molecules based on the protein active site information. The molecules belong to different chemical classes, giving a diverse group of ligands. The results are listed in Table 1, and all the 2D protein-ligand interaction images are shown in Figure 2.
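Shortlisting the best-scoring ligands from a docking run can be sketched in Python as below. AutoDock Vina reports binding affinity in kcal/mol, with more negative values indicating stronger predicted binding, so the ligands are sorted in ascending order. The function name is hypothetical, and any ligand identifiers or scores passed to it in an example are placeholders, not results from this study.

```python
def shortlist(affinities, n):
    """Return the n ligands with the most negative binding affinity.

    `affinities` maps a ligand identifier to its best docking score in kcal/mol.
    """
    ranked = sorted(affinities.items(), key=lambda item: item[1])
    return ranked[:n]
```

Calling shortlist(scores, 17) on the full set of docking scores would give a top-17 list of the kind carried forward to ADMET screening.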
ADMET prediction: The SwissADME web server was used to determine various physicochemical parameters, lead-likeness, and synthetic accessibility of the top 17 lead compounds, as shown in Table 2.

Conclusions
Chikungunya is a widespread threat to public health. There is currently no viable medicine or vaccine that can completely cure the disease. Using a deep learning methodology, this investigation identifies several new leads. The lead compounds appear promising against nsP2, one of the virus's most important therapeutic targets, in terms of PyRx binding affinity score, protein-ligand non-bonding interactions, and physicochemical features. Further experimental research is necessary to confirm their effectiveness and to find viable treatments for the viral infection.