Discovery of Potent Sialic Acid Inhibitors Against the Hemagglutinin of Influenza A Virus


 Influenza A virus (IAV) is hard to eradicate due to its genetic drift and reassortment ability. Neuraminidase (NA), frequently the target protein, cleaves the sialic acid (SA) and discharges virions to complete the infectious cycle. However, the increasing use of NA inhibitors aroused drug resistance in recent years. Hemagglutinin (HA), on the opposite, initiates the infectious cycle and sticks virions to cells by connecting to the host SA so that HA might be a tempting target. In this study, HA was chosen for SA-binding site model preparation to screen more than 200,000 compounds by molecular docking method in silico. According to the post-screening analysis by iGemdock and SiMMap, nine of the top twenty compounds based on the ranking score were purchased and evaluated. NSC85561 was initially identified from the compounds through the bioassays. Next, the twelve derivatives of NSC85561 were selected to consolidate the primary results. NSC85561 and two of its derivatives were finally identified as potent HA inhibitors by cell proliferation, plaque reduction, and hemagglutination inhibition assays in vitro. These results suggested that virtual screening may be a powerful tool to concise the compounds from the massive database and reduce the complicated bioassays.


Introduction
Influenza viruses have posed a severe threat to human health, causing up to 650,000 deaths and economic loss annually. Since the 2009 H1N1 flu pandemics, the 2018 flu season has recently been the worst catastrophe. Among the most deaths are the people age 65 or older in industrialized countries 1 .
Another study shows that children younger than five years attribute to influenza deaths, with 99% occurring in developing countries 2 . Over the years, seasonal vaccines are predicted to protect humans from infection but frequently failed because of influenza's genetic drift and reassortment ability. Yearly vaccination is the primary means of preventing and controlling influenza 3 , but it is inadequate to apply every scenario. There are three FDA-approved influenza antiviral drugs recommended by CDC for use against recently circulating influenza viruses. Rapivab (peramivir), Relenza (zanamivir), and Tamiflu (oseltamivir), which target neuraminidase (NA) as inhibitors, interfere with the release of progeny influenza virus from infected host cells. This process prevents the infection of new host cells and halts infection spread in the respiratory tract 4 . However, those NA inhibitors, which seem to be powerful agents to treat the influenza virus, recently arose some sporadic drug-resistance events. In Japan, oseltamivir-resistant H1N1 viruses were isolated from 7 of 43 patients (16%) 5 . About 27% of patients infected with H1N1 viruses showed resistance to oseltamivir in the UK 6 . Owing to the urgent situation that new targets must be explored, so we switch our direction to the hemagglutinin (HA).
HA plays a pivotal role during the Influenza A virus (IAV) life cycle and possesses a unique structure.
HA takes the role in the attachment and penetration to the host cells. Nevertheless, there are no HA inhibitor drugs offered to clinical patients. The main reason is that the variety of HA maybe even is strengthened by antigenic drift and shift 7, 8 . A report showed that HA globular head glycosylation might be an essential step for viruses to gain virulence and antigenic properties 9 . HA's specific structure is a homotrimer whose globular domain contains a receptor-binding site (RBS) specific for sialic acid (SA) expressed by the host cell glycoproteins and glycolipids, which terminates host glycans 10 . When the RBS of influenza binds to the host SA, the viruses can quickly enter the host cells. To prevent IAV from obtaining entry to the host cells, the RBS might be a potential target.
The homotrimer is constructed by the un-cleaved HA0 subunits, which are unable to fuse the host membranes. HA0 inhibitors interrupt the correct folding of HAs in progeny influenza viruses; therefore, they elicit nonfunctional HA conformation to block viral entry 11 . One example is AF4H1K1, which could block the immature HA0 cleavage and inhibit the infectivity of IAV 12 . The HA1 folds into a jellyroll motif of eight stranded antiparallel β-sheets and a shallow pocket at the distal tip, acting as an RBS surrounded by antigenic sites 13 . The mechanism of HA1 inhibitors mainly lies in blocking receptor binding so that it prevents virus infective initiation. The scientists found that a specific plant from Borneo was traditionally to treat symptoms of influenza infection. They extracted the plant and discovered that the useful substances were not limited to SA-like compounds but non-SA-like components that might act against other viral proteins apart from HA and NA 14 . Besides, the synthetic SA-mimic peptides showed a high affinity to the RBS to block the initial infection in another study 15 .
The HA2 mediates the host endosomal membrane's fusion with the viral membrane, allowing viral ribonucleoprotein entry into the host cell 16 . In the process of HA-mediated fusion, metastable HA2 subunits undergo irreversible rearrangement at acidic pH, which causes membrane fusion and the completion of viral entry 11 . One HA2 inhibitor is Arbidol (ARB, Umifenovir), which interacts with HA2 directly, and can prevent acidification-induced conformation change of HA2 and impeding fusion in endosomes 17 .
From the overview of the IAV structure, our study focused on the RBS of the HA1 subunit. The compounds were from the NCI database to identify potential inhibitors. Over 200,000 compounds were screened by the molecular docking method. Next, the docked compounds were used to recognize interaction preference by analyzing the RBS with interacting residues and specific physical-chemical properties. With these models, we could identify inhibitors with novel scaffolds. Some HA inhibitors were identified and validated by a series of bioassays, including cell proliferation, plaque reduction, and hemagglutination inhibition. To refine our analysis and find more compounds targeting HA1, we selected the initial compound's derivatives to enter the same experimental tests. Some attractive and tempting outcomes were finally unveiled. This study provided the comprehensive framework of efficient screening of HA1 inhibitor leading compounds for further drug design and development. Figure 1 gives an overview of anchors formation and post-screening analysis for the identification of novel HA inhibitors. First, the structure of the receptor-binding site (RBS) of the HA1 subunit (PDB ID: 1RUY) 18 was selected as the target protein, and 208,023 compounds from NCI were collected as the screening compound database (Fig. 1a). The molecular docking and post-screening analysis were performed by iGEMDOCK 19 (Fig. 1b). Next, SiMMap 20 anchors were constructed based on the docking score and the interaction types between top-ranked 1,000 compounds and the RBS (Fig. 1c). An anchor is composed of a binding pocket with corresponding interacting residues, moiety preference, and type of the interaction (E: electrostatic, H: hydrogen-bonding, or V: van der Waal forces). These 1,000 compounds were re-ranked based on the SiMMap score, and the top twenty compounds were selected as potential candidates (shown in Supplementary Table S1 online). Finally, nine compounds were purchased and evaluated by cell proliferation, plaque reduction, and hemagglutination inhibition assays in vitro (Fig. 1d).

Anchors of the Receptor-Binding Site
Three pivotal anchors, including E1, H1, and V1, (Fig. 2b) were generated by SiMMap in the RBS.
The E1 anchor has the preferred interaction with sulfonic acid and carboxylic acid moieties, interacting with one basic residue, R224 lies in the 220-loop. Consequently, the compounds which have a nucleophilic tendency have the affinity to bind in the E1 anchor. The H1 anchor interacts with one acidic residue D90, a small residue G92, and one nucleophilic residue T93 located in the 90-loop. The H1 prefers to bind to amid, ketone, amine, and hydroxyl groups (the preferences decrease in this order).
Last, the V1 anchor has a high tendency to bind to an aromatic and heterocyclic moiety, interacts with one amide residue N91, and one basic R224, the same as the E1 anchor. Both E1 and V1 have R224 because the weak non-bond force between guanidinium in R224 and aromatic moiety usually form planar stacking in nature, which will not interfere with the electrostatic force between R224 and acid moieties 21 . The corresponding 3D model is shown in figure 2a.

Derivative Compounds from NSC85561
Based on the SiMMap score, the nine compounds were purchased and performed the bioassays (Table   S1). The MTS method was used to examine the safety of the compounds. The 100 μM compounds were taken to measure the individual OD values and insight into the MDCK cells' toxicity ( Fig. 3a).
NSC97307 was excluded due to its relative cell viability was lower than 0.75. Then, the 25 μM compounds were taken to the plaque reduction assay. The values of PFU (% control) treated with NSC85561 and NSC84472 were dramatically decreasing (5±5 and 40±20), so NSC85561 and NSC84472 were unveiled ( Fig. 3b and Table 1). Next, the HA assay was used to validate the efficacy of NSC85561 and NSC84472. The compounds started with 100 μM by two-fold serial dilution in the HA experiments. NSC85561 formed precipitation at 12.5 μM, but NSC84472 was at 100 μM (Fig. 3c).
The efficacy of NSC85561 was much higher compared with NSC84472, so we took NSC85561 as the template for generating the twelve derivative compounds according to the atom-pair fingerprint among our compound dataset. The SMILES code of these twelve derivative compounds and NSC85561 are shown in table S2.

The post-screening analysis meets the bioassay outcome
The post-screening analysis from the iGEMDOCK showed a similar outcome versus the HA assay. Figure 4A presents the hierarchical clustering results of thirteen compounds according to their interaction profile. From NSC85561 down to NSC73413, similar patterns were found, and earlier clustered (Fig. 4a). In the HA experiments, compounds started with 25 μM by two-fold serial dilution.
NSC4203 formed precipitation at 25 μM, whereas NSC85561 and NSC73413 at 12.5 μM. Also, NSC47715 and NSC7223 at 6.25 μM stood out from the thirteen compounds (Fig. 4b). We picked out these five compounds based on the outcome of the interaction profile and HA assay. Next, to validate these five compounds' efficacy, we used virus plaque reduction assay in different concentrations. The MDCK cells were seeded in 6-well plates for 24 hours, mixed with or without the five compounds at 25 μM or 12.5 μM (Fig. 5a-5b). When our five compounds were at 12.5 μM, the efficacy of NSC47715, NSC85561, and NSC7223 was much higher than NSC4203 and NSC73413. Figure 5d). NSC47715 showed the lowest values, which means that NSC47715 could inhibit IAV at the lowest concentration among the three compounds. SI value, the ratio of the CC50 and EC50, was used to evaluate safety and toxicity simultaneously. The summary table is showed in table 2.

Discussion
In our study, HA was the target to block IAV infection, which was different from the recent research focusing on neuraminidase (NA) [22][23][24] . However, the NA-drug resistance events happened continuously, so NA might not be a reliable target to quench IAV. From a 2019 research, the authors emphasized the vulnerable HA stalk trimer domain and depicted the conserved residues over various IAV strains 25 .
Besides, they found that FluA-20 (human antibody) tends to bind between the 220-loop and the 90loop. Our models and predictions were consistent with their findings, so the anchors might play an essential role in interfering with IAV's fusion.
The interaction profile based on the post-screening analysis presents some intriguing outcomes ( fig.   4a). Although six compounds from NSC85561 down to NSC73413 showed a similar pattern, NSC87862 did not demonstrate the HA assay's corresponding efficacy ( fig. 4b). NSC87862 shows a lower binding affinity toward the side chain of N91 's hydrogen-bonding force and Y102's van der Waal force compared with the other five compounds. Therefore, the 2D interaction diagrams of six compounds with the pocket environment were examined by the PROTEINS PLUS server 26 (fig. 6a-6c and S. 1a-1c). All compounds have two naphthalene rings except NSC87862, which might be the critical reason for weak binding affinity. Also, NSC4203 and NSC73413 possess weak efficacy toward IAVs that might lie in their orientation of one naphthalene between residue T93 and G92 ( fig. 6d-6f and S. 1d-1f). The interface view of RBS with docked NSC47715, NSC85561, and NSC7223 was shown in Figure 6g-6i. The most significant difference is the additional nitrite in NSC85561 and NSC7223 compared with NSC47715, and it might interfere with their binding to the HA. Consequently, virtual screening and bio-assay methods complement each other to see more clearly and more objective about identifying novel inhibitors.

Conclusions
Our research sheds light on the utility and feasibility of virtual screening with the outcome of SiMMap for identifying lead compounds. NSC85561 showed the highest affinity to the RBS initially. The same experimental assays were conducted on derivatives of NSC85561, and three compounds, NSC85561, NSC7223, and NSC47715 in total, were finally identified as the potential inhibitors of HA. These strategies reveal that SiMMap enables us to search for desired functional groups and analyze the connection between the RBS and the compounds. To sum up, we considered that the virtual screening methods possessed the potential to eliminate the threat of IAV shortly.

Dataset Preparation and Virtual Screening
The structure of H1N1 (PDB ID: 1RUY) was selected as the target protein for virtual screening from Protein Data Bank 27 . It belongs to the H1 class from the 1930 swine influenza strain. The method to determine the structure of 1RUY is through X-ray diffraction with a resolution of 2.7Å. The compounds employed for virtual screening were assembled from the NCI database 28 . The 208,023 compounds were finally filtered out according to Lipinski's rules of five. The iGEMDOCK software 19 , a widely used and accurately predicted molecular docking method, was used to dock the compounds into the RBS of the HA1 subunit. Besides, the iGEMDOCK consists of three types of interactions: electrostatic, hydrogen-bonding, or van der Waals forces. Based on the interactions (docking energy), which consist of a simple empirical scoring function and a pharmacophore-based scoring function, the top 1,000 compounds were selected.

SiMMap Construction and Hemagglutinin Inhibitors Identification
The top 1,000 compounds ranked by docking energy with their binding poses were submitted to the SiMMap server to reevaluate 20 . The SiMMap was used to analyze a protein binding site to identify anchors (antigenic sites), often critical binding environments 24 . The moiety compositions between these compounds and the SiMMap anchors provide clues for lead optimization. Based on the proteincompound interaction profiles generated by the SiMMap, consensus interactions between compound moieties and binding site composed of consensus interacting residues were identified. Next, a sitemoiety map score for each chosen compound was generated. The identified compounds were reordered with the scores. Finally, the promising HA1 inhibitors were picked out for further rigorous examination based on their ranking, feasibility, physicochemical property, and interaction type.

Cell Proliferation Assay (MTS)
To test the cells' corresponding viability and cytotoxic compounds' analysis, we conducted a cell proliferation assay (MTS) 29 , a colorimetric method for sensitive quantification. Based on the reduction of MTS tetrazolium compound by viable cells to generate a colored formazan product, the formazan dye produced by viable cells can be quantified by measuring the absorbance at 490 nm. MDCK cells (1 x 10 4 cells/well) are seeded into 96-well microtiter plates overnight. Next, the MDCK cells were mixed with our potential compounds in flu-medium for 72 hours at 37°C with 5% CO2. The cells were washed with phosphate buffer saline (PBS; Na2HPO4: 8mM, NaCl: 137 mM, KCl: 2.68mM, KH2PO4: 1.47 mM adjusted to pH7.2) twice after removing the previous medium in the wells. 10μL MTS solutions and 90μL DMEM were added to each well, and the cells were incubated for an additional half an hour. Finally, absorbance at 490 nm was recorded using a microplate reader (SpectraMax iD3, Molecular Devices, USA). A blank control (flu-medium only) and cell control (without compounds defined as 100% cell survival) were included in every assay plate. The mean optical density (OD, absorbance) of the three wells in the indicated groups was used to calculate the percentage of cell viability as follows: Percentage of cell viability = (Atreatment − Ablank)/(Acontrol − Ablank) × 100% (A: absorbance) 30 . Furthermore, the cytotoxic concentration was required for the identified compounds to reduce cell viability by 50% (CC50) was determined.

Plaque Assay
To determine viral titer as plaque-forming units per ml (PFU/ml), we selected plaque assay 31,32 , which is a standard method to quantify the virus. In brief, MDCK cells (1 × 10 6 cells/well) were seeded in 6well plates for 24 hours at 37°C with 5 % CO2. The single layer of cells was infected with the indicated dose of IAV, followed by washing with PBS three times. Next, the cells were covered with 0.3% agarose in the flu-medium for an additional two days at 37°C. Cells were fixed with 10% formaldehyde and then stained with 1% crystal violet. Finally, the PFU was calculated.

Plaque Reduction Assay
To quantify the antiviral activity of the compounds versus IAV, we conducted the virus plaque reduction assay. As described above, the previous seeded monolayer cells (100 PFU/well) were incubated with or without the selected compounds. The virus suspension was removed after the IAV adsorbed the compounds for an hour. The cells were then immediately washed trice with PBS and overlaid with 0.3% agarose in the flu-medium again present or absent the compounds for additional 48 hours. The 50% effective concentration (EC50) of the compounds was determined by comparing the number of plaques in the virus-infected control. Furthermore, the EC50 was defined as half maximal inhibitory concentration, and then the selectivity index (SI) was determined by the ratio: CC50/EC50.

Hemagglutination Inhibition Assay
Our experiment observed an erythrocyte agglutination test whether sialic acid was bound to a compound and whether the filtered compounds inhibited hemagglutination from causing a precipitation reaction in the sialic acid-binding and membrane fusion of HA, respectively. If there is agglutination, we can measure the lowest concentration of the virus. First, we diluted our compounds with PBS to one-thousandth. MDCK cells were also diluted in a serial twofold to gain the titer of our IAV. To reduce the interference of the antibodies in erythrocytes, we prepared human type-O serum. Next, the sera were diluted in a serial twofold in a 96-well microtiter plate.

Data availability
The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.   Hemagglutination inhibition assay: to measure influenza-specific levels in blood serum.