Human TMPRSS2 non-catalytic ectodomain and SARS-CoV-2 S2' subunit interaction mediated SARS-CoV-2 endocytosis: a model proposal with virtual screening for potential drug molecules to inhibit this interaction

Abstract
This study proposes a novel model for the entry of SARS-CoV-2 into the host cell via endocytosis as a possible alternative to the prevailing direct fusion model. It is known that the SARS-CoV-2 spike protein undergoes proteolytic cleavage at the S1-S2 cleavage site and that the cleaved S2 domain is primed by the activated serine protease domain (SPD) of human TMPRSS2 to become S2'. The activated SPD of TMPRSS2 is formed after it is cleaved by autocatalysis from the membrane-bound non-catalytic ectodomain (hNECD), which comprises an LDLRA class-I repeat and an SRCR domain. It is known that proteins harboring SRCR domains or LDLRA repeats mediate endocytosis of viruses and certain ligands. Based on this, we put forward the hypothesis that the exposed hNECD binds to S2', as both are in interaction proximity soon after S2 is processed by the SPD, and that this interaction may lead to endocytosis of the virus. Based on this hypothesis, we modelled the hNECD structure and carried out docking studies with the known 3D structure of S2'. The interaction interface of hNECD with S2' was further used for virtual screening of FDA-approved drug molecules and Indian medicinal plant-based compounds. We also mapped the known mutations of concern and mutations of interest onto the interaction interface of S2' and found that none of the known mutations map onto it. This indicates that the interaction between the hNECD of TMPRSS2 and S2' may serve as an attractive therapeutic target. Communicated by Ramaswamy H. Sarma

Introduction
COVID-19, caused by SARS-CoV-2, attained global pandemic status within a few months of its first report, leading to an unprecedented loss of human lives. In addition, many countries have been severely impacted socio-economically by the lockdowns imposed to contain the spread of the virus. With the recent emergence of the Omicron variant of SARS-CoV-2, there is a general expectation in the medical fraternity that other variants of concern may emerge, some of which may be more contagious than previous variants and able to infect people who are vaccinated or have a previous history of COVID-19. Hence, there is a need to investigate the biology of the virus in the host more deeply than has been done so far, as this can give rise to novel targets for drug design or for repurposing of known drugs to prevent or treat SARS-CoV-2 infections.
Hitherto, studies toward the development of antiviral compounds against COVID-19 have focused on human as well as viral factors that play key roles at different stages of the life cycle of the virus. Among these targets, the host factor human TMPRSS2 has been shown to facilitate viral infection through two independent functions: (a) priming of the viral spike protein (S-protein), leading to the generation of the fusion peptide referred to as S2' (Heurich et al., 2014), and (b) proteolytic cleavage of the ACE2 receptor after it binds to the cleaved S1 subunit of the viral spike protein (Hoffmann et al., 2020).
Human TMPRSS2 is a membrane-bound serine protease of 492 aa, composed of the cytosolic part (1-84), the transmembrane part (85-105), and the ectodomain (106-492). The ectodomain is further composed of an N-terminal part comprising one 38 aa long Low Density Lipid Receptor-Class A (LDLRA) repeat (112-149) and one 93 aa long Scavenger Receptor Cysteine Rich (SRCR) domain, and a C-terminal part comprising one 234 aa long serine protease (SP) domain (255-492) (Figure 1). Henceforth, for convenience, we refer to the N-terminal non-serine-protease part of the ectodomain of human TMPRSS2 as hNECD. It has been shown that human TMPRSS2 undergoes autocleavage in prostate cancer cells expressing TMPRSS2, by which the SP domain is cleaved from the whole protein, leaving behind the membrane-bound hNECD (Afar et al., 2001). A recent study has also confirmed the cleavage of the catalytic domain from the hNECD (Fraser et al., 2021).
The SRCR domain of hNECD belongs to the superfamily of SRCR domains found only in eukaryotes. These domains are highly conserved and found in many cell receptors (Sarrias et al., 2004). SRCRs are known to bind different types of ligands, including LDLs, bacteria, and viruses, and even to mediate their endocytosis (Kaksonen & Roux, 2018; Yap et al., 2015). SRCR domains have been reported to be essential for infection of pigs by porcine reproductive and respiratory syndrome virus (PRRSV). Gene knockout of a specific SRCR5 domain in pigs made them resistant to PRRSV1 virus (Burkard et al., 2018). LDLRA repeats are also known to be part of proteins involved in the endocytic pathway (https://prosite.expasy.org/doc/PS01209). LDL receptors have been reported to be the host cell attachment receptors for Human Rhinovirus 2 (HRV2), which then enters host cells by clathrin-mediated endocytosis (Fuchs & Blaas, 2012).
Using the above information as a basis, we put forward the hypothesis that the exposed, membrane-bound hNECD may bind to S2', and that this interaction may mediate the endocytosis of virus particles (Figure 2). We further propose a model for this interaction and virtually screen a known set of approved drugs as well as a known set of phytochemicals that might disrupt the interaction of hNECD with S2'. This, we believe, might open a new paradigm of viral entry into the host cell and might also help in the development of a new treatment regime for COVID-19. In addition, we observe that none of the known mutations of concern and mutations of interest map onto the interaction interface of S2' with hNECD; therefore, any drug development aimed at disrupting this interaction can potentially give rise to a variant-independent treatment regime.

Methodology
Modeling of hNECD 3D-structure
A crystal structure of the TMPRSS2 SRCR domain together with the SP domain is available under PDB id 7MEQ; however, in this structure the LDLRA domain is missing and the SRCR domain has 8 missing residues. Therefore, we used the AlphaFold v2.1.0 (Jumper et al., 2021) Google Colab server (https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb) to model the 3D structure of hNECD. We submitted the protein sequence of the hNECD region (residues 112 to 243), covering the LDLRA and SRCR domains, for structure prediction and obtained five models. We selected the best of the five predicted models (Figure 3) and validated its stereochemical quality and fold compatibility using PROCHECK (R. A. Laskowski et al., 1993) and the ERRAT plot (Colovos & Yeates, 1993), respectively.
It is worth mentioning that an AlphaFold-predicted model of the complete TMPRSS2 protein is available (https://www.uniprot.org/uniprotkb/O15393/entry#structure). As per the PDB record, this model was generated on July 1, 2021. We surmised that a better model might be obtained because of subsequent improvements in AlphaFold itself, and hence we modelled hNECD separately as described above. We compared the hNECD model extracted from the complete TMPRSS2 model (Early model) with our model and found an RMSD of 0.53 Å. Furthermore, our model was found to be slightly better than the Early model (Supplementary Figure S3).

SARS-CoV-2 spike protein
Many crystal structures of the spike protein of SARS-CoV-2 have been solved, and many of them have gaps in the S2 region. We therefore chose a Cryo-EM structure representing the prefusion state (PDB id: 6XR8), as it has the complete S2' region. As mentioned in the introduction, the spike protein first undergoes cleavage at the S1/S2 protease cleavage site (between residues Q675-I692), followed by priming of S2 to S2' at the S2' cleavage site (815-816); both sites are shown by arrows in Figure 1. Our hypothesis is that the S2' region (referred to henceforth as VS2') binds to the hNECD and mediates endocytosis. Hence, we removed the amino acid residues up to the second cleavage site (R815) from the Cryo-EM structure using PyMol 2.3.4 (Schrodinger, LLC), and the remaining structure, VS2', from S816 to P1162 was used for further studies (Figure 3). The VS2' structure was examined for its stereochemical quality using PROCHECK (R. A. Laskowski et al., 1993) and ERRAT (Colovos & Yeates, 1993) (Supplementary Figures S4-5), and was also checked against the quality report given in the PDB database. The VS2' structure was found to be suitable for further studies involving protein-protein docking and MD simulation.
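The truncation step described above was carried out by the authors in PyMol. Purely as an illustration of the same idea, the following is a minimal Python sketch that keeps only the ATOM/HETATM records of a PDB-format file whose residue number lies between 816 and 1162. The function name trim_pdb_lines is hypothetical, and the sketch assumes standard fixed-column PDB records; it is not the procedure used in the study.

```python
def trim_pdb_lines(pdb_lines, first_res=816, last_res=1162):
    """Keep ATOM/HETATM records whose residue number is in [first_res, last_res].

    Hypothetical helper for illustration; the study itself used PyMol.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 23-26 of a fixed-width PDB record hold the residue number.
            res_num = int(line[22:26])
            if first_res <= res_num <= last_res:
                kept.append(line)
    return kept
```

In practice the input lines would come from reading the downloaded 6XR8 coordinate file, and all three chains of the trimer would be retained because the filter does not look at the chain identifier.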

Protein-protein docking studies
We used the HADDOCK2.2 (van Zundert et al., 2016) server (https://wenmr.science.uu.nl/haddock2.4/) for protein-protein docking studies. We used the modeled hNECD structure and the VS2' structure as input structures and performed blind docking, considering the entire structure of modeled hNECD and residues S816-T883 and L948-M1029 of the VS2' structure for sampling. HADDOCK was run with its default parameters. The PPCheck protein-protein interaction tool (Sukhwal & Sowdhamini, 2013) was used to analyze the different docking poses for their binding energies and for the presence of salt bridges and hydrogen bonds at the protein-protein interaction interface. The following cutoff distances were used to identify hydrophobic and salt-bridge interactions: (a) hydrophobic interactions: Cβ-Cβ distances between a pair of amino acid residues within 7 Å, and (b) salt bridges: distances of less than 4 Å between the side chain nitrogen and oxygen atoms of two oppositely charged residues.
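The two distance criteria above can be expressed as simple predicates. The following is a minimal sketch rather than the PPCheck implementation: the function names are hypothetical, coordinates are assumed to be (x, y, z) tuples in Å, and the 7 Å value is treated as an upper bound on the Cβ-Cβ distance, which is an assumption about how the cutoff is applied.

```python
import math

HYDROPHOBIC_CUTOFF = 7.0  # Å, Cβ-Cβ distance (assumed to be an upper bound)
SALT_BRIDGE_CUTOFF = 4.0  # Å, side-chain N to side-chain O distance


def is_hydrophobic_contact(cb_a, cb_b):
    """True if two Cβ atoms lie within the hydrophobic cutoff."""
    return math.dist(cb_a, cb_b) <= HYDROPHOBIC_CUTOFF


def is_salt_bridge(n_atom, o_atom):
    """True if a side-chain N and a side-chain O are closer than 4 Å."""
    return math.dist(n_atom, o_atom) < SALT_BRIDGE_CUTOFF
```

A full analysis would additionally restrict the first test to hydrophobic residue pairs and the second to oppositely charged residues; that residue-type filtering is omitted here.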

MD simulation of hNECD and VS2' complex
MD simulation was used to ascertain the stability of the complex and was carried out using GROMACS version 2020.5 (Abraham et al., 2015; Lindahl et al., 2020). The OPLS-AA force field was used with the TIP3P water model (MacKerell et al., 1998) for solvation. The periodic boundary condition of 1.0 nm was used. The system was neutralized by adding 15 Na+ (NA) counter ions. The hNECD-VS2' complex structure was subjected to steepest descent minimization. The energy-minimized structure so obtained was subjected to MD simulations, first under the NVT and then under the NPT protocol. We used a modified Berendsen thermostat (Berendsen et al., 1984) to maintain a constant temperature and the Parrinello-Rahman barostat algorithm (Nosé & Klein, 1983; Parrinello & Rahman, 1981) to maintain constant pressure during the simulations. We carried out MD simulations with the backbone of VS2' constrained in the hNECD-VS2' complex for 100 ns with 50000000 steps (integration step = 2 fs) and saved snapshots of instantaneous structures every 5000 steps, i.e., with a time gap of 10 ps.
Figure 3. 3D structure of hNECD modelled using AlphaFold and VS2' structure extracted from the Cryo-EM structure of the spike protein (PDB Id: 6XR8). These figures were produced using PyMol (Schrodinger, LLC). hNECD is composed of LDLRA and SRCR domains. VS2' is a trimeric structure (each chain is shown in a different color). The surface view of hNECD and VS2' shows the distribution of surface electrostatic potential; as the color legend at the bottom shows, red represents negative potential, blue represents positive potential, and white represents neutral surface potential.
We used the GROMACS command tools "gmx_rmsd" to analyze RMSD and "gmx_saltbr" to identify salt bridge interactions. The Hydrogen Bonds extension of VMD 1.9.3 (Humphrey et al., 1996) was used to calculate hydrogen bond interactions. We performed cluster analysis on the snapshots saved during the simulation using gmx cluster with an RMSD cutoff of 0.2 nm. The cluster with the greatest number of snapshots was chosen as representative of the stable structure, and its centroid was used for further studies, including virtual screening and analysis of the interaction interfaces. We also identified the hotspot residues, which are critical for the interaction of hNECD with VS2', using the PPCheck hotspot prediction tool (Sukhwal & Sowdhamini, 2013).
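As a quick consistency check on the simulation settings given in this subsection, the run length and snapshot spacing follow directly from the step count, the 2 fs integration step, and the 5000-step save interval. The sketch below is only arithmetic; the function name is hypothetical, and the frame count assumes that the starting structure is also written as a frame.

```python
def md_bookkeeping(n_steps=50_000_000, dt_fs=2, save_every=5000):
    """Return (total time in ns, frame spacing in ps, number of saved frames)."""
    total_ns = n_steps * dt_fs / 1_000_000      # 1 ns = 1,000,000 fs
    frame_gap_ps = save_every * dt_fs / 1000    # 1 ps = 1,000 fs
    # Assumption: frame 0 (the starting structure) is saved as well.
    n_frames = n_steps // save_every + 1
    return total_ns, frame_gap_ps, n_frames
```

With the values quoted in the text this gives 100 ns, a 10 ps gap, and 10001 frames under the stated assumption.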

Virtual screening studies
We chose the hNECD interaction surface for virtual screening using two different sets: (a) a set of US FDA-approved compounds (FDAC) and (b) a set of phytochemicals (PHTC). A set comprising 2186 FDACs was extracted from DrugBank (Wishart et al., 2006) in SDF format and converted into a PDBQT-format library using Open Babel (O'Boyle et al., 2011). The second set of 478 druggable phytochemical compounds in PDBQT format was extracted from the Indian medicinal plants reported in the IMPPAT database (Mohanraj et al., 2018). AutoDock Tools (Morris & Huey, 2009) was used to prepare hNECD for virtual screening. The search grid box was chosen such that it covers the entire interaction surface of hNECD. The size of the grid box was (60 × 60 × 60) Å³ and the grid center was (X = -32.84, Y = 20.313, Z = -12.862). A shell script provided by the Vina authors was used to run AutoDock Vina (Trott & Olson, 2010) for virtual screening of the library of FDA-approved drug molecules and phytochemical compounds from Indian medicinal plants. We identified the best 10 potential FDA-approved drug molecules based on (a) their docking scores, (b) their binding poses within the hNECD-VS2' interface, and (c) their suitability for repurposing based on their drug indications. We also identified the best 10 potential phytochemical molecules based on their docking score and binding pose within the hNECD-VS2' interface. PyMol (Schrodinger, LLC) and LigPlot+ (Roman A. Laskowski & Swindells, 2011) were used to analyse and visualize interactions between hNECD and the drug molecules as well as the phytochemicals.
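To make the search-box definition concrete, the following is a minimal sketch that writes the box centre and size into the text of an AutoDock Vina configuration file. The option names (receptor, ligand, center_x, size_x, out, and so on) are standard Vina configuration keys; the file names hNECD.pdbqt and ligand.pdbqt and the function name are hypothetical placeholders, and this is not the shell script used in the study.

```python
def vina_config(receptor, ligand, out,
                center=(-32.84, 20.313, -12.862),
                size=(60, 60, 60)):
    """Build the text of a Vina config file for one ligand (illustrative only)."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
    ]
    # Box centre and edge lengths along x, y, z, in Å.
    for axis, c, s in zip("xyz", center, size):
        lines.append(f"center_{axis} = {c}")
        lines.append(f"size_{axis} = {s}")
    return "\n".join(lines) + "\n"


# Hypothetical file names, used only to show the call.
config_text = vina_config("hNECD.pdbqt", "ligand.pdbqt", "ligand_out.pdbqt")
```

Screening a library then amounts to generating one such configuration (or overriding the ligand option) per compound and collecting the reported scores for ranking.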

Results
Modeling of hNECD 3D structure
As mentioned already, an experimentally determined 3D structure of the hNECD is not yet available. Therefore, we built a model from the hNECD protein sequence using the AlphaFold v2.1.0 Google Colab server (please see Figure 3). The hNECD of TMPRSS2 comprises the LDLRA and SRCR domains, which harbor six and four Cys residues, respectively. In the known structures, these Cys residues form disulfide bonds that stabilize the fold. In the predicted hNECD model, too, we observed that these S-S bonds are retained with the known connectivities and topologies. The Ramachandran and ERRAT plots show that the model is satisfactory (Supplementary Figures S1 and S2, respectively).

Protein-protein docking studies
As already mentioned, we hypothesize that VS2' binds to hNECD soon after it is cleaved by the activated serine protease, and that this interaction might mediate the endocytosis of the virus. We therefore carried out docking studies on the hNECD and VS2' structures and identified the binding pose with the lowest binding energy (-385.46 kcal/mol) as a putative representation of the interaction between hNECD and VS2'.

hNECD-VS2' Complex is stable throughout MD simulations
We performed MD simulations of the hNECD-VS2' complex for 100 ns and observed that the complex appears to stabilize after 30 ns and thereafter fluctuates with an RMSD between 0.3 and 0.4 nm (Supplementary Figure S6). In order to ascertain the stabilization time more precisely, we clustered all the snapshots saved every 10 ps from the start to the end of the simulation. We found ten clusters (Supplementary Table 1), of which cluster 1 is the major cluster, comprising snapshots from 35 ns to 100 ns and indicating stabilization of the complex.
The interaction between hNECD and VS2' is stabilized by several H-bonds and salt bridges. Among these, the H-bonds between R1000 of VS2' and D203 of hNECD and between K112 of hNECD and D867 of VS2' are highly conserved during the simulation (present in more than 60% of frames). Among the salt bridges, E990 (VS2')-K234 (hNECD) and D867 (VS2')-K112 (hNECD) are found in 24.70% and 95.15% of time points, respectively (Supplementary Figures S7-10). The clustering of instantaneous structures saved every 10 ps, with an RMSD cutoff of 0.2 nm, gave 10 clusters. The largest cluster has 5596 snapshots with an RMSD of 0.186 nm, and we selected the central snapshot of this cluster as the representative of the stable hNECD-VS2' complex and used this structure for virtual screening of drugs (Figure 4). The interaction is characterized by a number of polar and apolar contacts, which are listed in Supplementary Table 2; the interfacial residues of both molecules are listed in Supplementary Table 3. The hNECD interface consists of 24% charged and 53% hydrophobic amino acids, and the VS2' interface consists of 18% charged and 64% hydrophobic amino acids. Among the interface residues, the PPCheck hotspot prediction tool (Sukhwal & Sowdhamini, 2013) predicted K112, R150, G153, P154, and F156 of hNECD and F833, Q836, Y837, and L864 of VS2' as the hotspot residues.
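Percentages of the kind reported above (the fraction of frames or time points in which a contact is present) can be obtained from a per-frame distance series by counting frames below a cutoff. The sketch below illustrates that calculation only; the function name is hypothetical, the 4 Å default is taken from the salt-bridge criterion in the methods, and the distance values in the usage line are made-up toy numbers, not data from the simulation.

```python
def occupancy(distances_angstrom, cutoff=4.0):
    """Fraction of frames in which the distance is below the cutoff."""
    if not distances_angstrom:
        raise ValueError("at least one frame is required")
    hits = sum(1 for d in distances_angstrom if d < cutoff)
    return hits / len(distances_angstrom)


# Toy distances (Å) for four hypothetical frames; three are below 4 Å.
example = occupancy([3.1, 3.5, 4.2, 3.9])
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the values quoted in the text.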
It has been reported that a pair of Aspartate residues in the SRCR domain of CD163 protein binds to basic residues in its binding partner (haptoglobin-hemoglobin) (Nielsen et al., 2013). In our model too, we find salt bridge interactions between D203 of the SRCR domain of hNECD and R1000 of VS2'.

Virtual screening of FDA approved drugs
We carried out virtual screening of FDA-approved drugs that potentially bind to the hNECD binding interface using AutoDock Vina, as described in the methods section. We selected the top 10 FDA-approved drugs based on their docking scores and their binding poses at the VS2'-binding interface of hNECD. These molecules bind to the binding pocket in hNECD and hence mask the residues I118, P129, W132, R147, V149, L151, Y161, K166, W168, M188, and Y190, as well as the hotspot residues R150, P154, and F156, at the binding interface for VS2'. Hence, these molecules have the potential to prevent the interaction of VS2' with hNECD. Figure 5 shows the binding of the top-scoring FDA-approved drug, Ledipasvir, to hNECD. Upon examining the targets and indications, we found that four of the top 10 drugs may not be suitable for COVID-19 treatment: Ganirelix, Colfosceril palmitate, Indocyanine, and Mivacurium. Colfosceril palmitate is a surfactant used for the treatment of respiratory distress syndrome in premature infants and may not be suitable for adults. Indocyanine is a diagnostic agent. Ganirelix is an antagonist of the human GnRH receptor used in women undergoing assisted reproduction. Mivacurium is a muscle relaxant generally used during anesthesia. After removing these four drugs, we updated the list of "ten top scoring drugs" by adding the next four (Dutasteride, Everolimus, Telavancin, and Fidaxomicin) from the ordered list of AutoDock Vina results.
Drug targets and indications of the updated list of top 10 scoring drugs are given in Table 1. Among the top molecules, Ledipasvir and Ombitasvir are known inhibitors of the NS5A protein of Hepatitis C Virus (HCV) (DeGoey et al., 2014; Kumari & Nguyen, 2015). Lopinavir is an inhibitor of HIV-1 protease (Sham et al., 1998). Temsirolimus is an mTOR inhibitor used in renal cell carcinoma (RCC) treatment (Boni et al., 2007). Maraviroc is a CCR5 co-receptor antagonist that blocks HIV from entering host cells (Perry, 2010). Revefenacin is a bronchodilator for Chronic Obstructive Pulmonary Disease (COPD) patients that acts as an antagonist of cholinergic muscarinic receptors in airway tissue (Quinn et al., 2018). Dutasteride is an inhibitor of 5-alpha reductase and is used for the treatment of symptomatic benign prostatic hyperplasia (BPH) (Djavan et al., 2005). Everolimus is an mTOR inhibitor used in the treatment of breast cancer and renal cell carcinoma (Franz et al., 2010). Telavancin and Fidaxomicin are antibacterial drugs used in the treatment of hospital-acquired bacterial pneumonia and diarrhea associated with Clostridium difficile infection, respectively (Crawford et al., 2012; Laohavaleeson et al., 2007). As can be seen from Table 1, four of the top hits (Ledipasvir, Ombitasvir, Lopinavir, Maraviroc) are antiviral, two (Telavancin, Fidaxomicin) are antibacterial, and the remaining four (Temsirolimus, Revefenacin, Dutasteride, Everolimus) target human proteins. Of the four drugs that target human proteins, two (Temsirolimus and Everolimus) are anticancer drugs, Dutasteride is used for BPH, and Revefenacin is a bronchodilator used for COPD patients.
Given that these are top-scoring drugs, we believe that, in the absence of their original targets (viral and bacterial targets), they might bind to hNECD at the VS2' interaction interface and hence are attractive candidates for further in vitro and in vivo studies.
It is interesting to note that, among these top hits, Dutasteride has undergone clinical trials for the treatment of mild COVID-19 symptoms in males. Compared to placebo, treatment with this drug reduced the duration of the disease and also reduced fatigue (Cadegiani et al., 2021). Clinical trials of the Sofosbuvir/Ledipasvir and Lopinavir-Ritonavir combinations on a small number of COVID-19 patients have not shown significant benefits (Cao et al., 2020; Nourian et al., 2020).

Virtual screening of phytochemical compounds
We also performed virtual screening of phytochemical compounds from Indian medicinal plants targeting the hNECD binding interface. The top 10 phytochemical compounds were selected based on their docking score and are listed in Table 2. Among the top 10 phytochemicals, five are from Gamhar (Gmelina arborea), and this plant is one of the 15 herbal ingredients in the Government of India's Ayush formulation called "Agasthaya hareetaki", recommended for the management of respiratory infections (Ahmad et al., 2020). These phytochemical compounds bind at the binding pocket of hNECD and mask the residues I118, P129, W132, R147, V149, L151, Y161, W168, M188, and Y190, as well as the hotspot residue R150, at the binding interface for VS2'. As in the case of the FDA-approved drugs, these phytochemicals too mask the interaction of VS2' with hNECD. Figure 6 shows the binding of the top-scoring phytochemical, Isoarboreol from the Gamhar medicinal plant, to hNECD.

Discussion
SARS-CoV-2 entry into host cells has been reported to follow a direct fusion mechanism soon after cleavage at the second cleavage site in the spike protein by the activated protease of TMPRSS2. Despite this, literature evidence suggests that an endocytosis process cannot be completely ruled out (Burkard et al., 2018; Yap et al., 2015). We take our cue from the fact that S2' cleavage by the activated serine protease domain, which is tethered to the hNECD by a disulfide bond between Cys244 and Cys365, necessarily positions the hNECD and the primed VS2' close enough for them to bind each other. The modeling and docking studies presented in this communication are an attempt to provide a possible model of this interaction. We propose that the hNECD binds to the primed spike protein, resulting in SRCR/LDLRA domain-mediated endocytosis of SARS-CoV-2 into host cells. Until now, the protein-protein interaction between the serine protease domain of TMPRSS2 and the spike protein has been investigated to understand the molecular interactions involved in the cleavage of the spike protein at the S1/S2 and S2' cleavage sites (Hussain et al., 2020). However, the interaction between hNECD and VS2' has not yet been captured or reported. This is perhaps due to the inability of methods such as affinity-purification mass spectrometry to detect interactions involving membrane-bound proteins (Gordon et al., 2020). Recent studies on potential peptidomimetics inhibiting TMPRSS2 activity showed a reduction of SARS-CoV-2 infection in human cells previously treated with them (Shapira et al., 2022). This supports TMPRSS2 as a potential target for the prevention of SARS-CoV-2 infection.
We used the interaction interface of hNECD from the hNECD-VS2' complex for virtual screening of FDA-approved drugs and druggable phytochemical compounds derived from the Indian medicinal plants database, and identified the best 10 potential drugs and best 10 potential phytochemical compounds that could inhibit the interaction between hNECD and VS2' and thereby disrupt the hNECD-mediated entry of SARS-CoV-2 into host cells. We mapped the known mutations of concern and mutations of interest in the SARS-CoV-2 spike protein and observed that none of them map onto the interaction interface of VS2' with hNECD; we therefore consider the hNECD-VS2' interaction a potential therapeutic target against the current SARS-CoV-2 variants of concern.
Our virtual screening of FDA-approved drugs identified Dutasteride as one of the hits. This drug has undergone a clinical trial in males with mild COVID-19 symptoms, with a promising benefit (Cadegiani et al., 2021). Maraviroc, which is known to block HIV entry into host cells, has also been shown in in vitro studies to inhibit early infection of host cells by SARS-CoV-2 (Risner et al., 2020). In the light of our model, studies could be carried out to investigate the efficacy of these drugs in patients with COVID-19. Our studies have also identified some phytochemical compounds with which to initiate studies based on the target, i.e., the hNECD-VS2' interaction. It is known that the medicinal plant Gamhar shows antiviral activity. However, its active compound and possible target have not been well studied, and in this light our study can serve as a testable hypothesis for further studies on the efficacy of these phytochemical compounds for the treatment of COVID-19.
Figure 5. Binding pose of the top FDA-approved drug (Ledipasvir) bound to hNECD, blocking VS2'. Panel A shows Ledipasvir binding to hNECD with interface residues represented in magenta color. Panel B shows the residue-level interaction of Ledipasvir with hNECD generated using LigPlot+ (Laskowski & Swindells, 2011). Panel A was generated using PyMol (Schrodinger, LLC).
Table 3. The variants of concern of SARS-CoV-2 and their respective mutations.
Mapping of known mutations of concern and mutations of interest onto the interaction interface of S2'
Mutations with evidence of increasing transmissibility or virulence or decreasing therapeutic/vaccine efficacy are classified as Mutations of Concern (MC), and mutations suspected of causing a change in transmissibility, virulence, or therapeutic/vaccine efficacy are classified as Mutations of Interest (MI) (https://outbreak.info/situation-reports). Mutation E484K has been classified as an MC, and the mutations L18F, K417N, K417T, N439K, L452R, S477N, S494P, N501Y, P681H, and P681R have been classified as MIs. It is interesting to find that none of these mutations (MC or MIs) map onto the interaction interface of S2' and hNECD.
The Delta variant has one mutation, D950N, in the S2' region, and this mutation is not present at the interaction interface of S2' with hNECD. The Omicron strain has four mutations, N856K, Q954H, N969K, and L981F, in the S2' region. Of these four, two mutations, N856K and L981F, are at the interaction interface of S2' with hNECD. Stealth Omicron (BA.2) has only two mutations, Q954H and N969K, in the S2' region, and neither is at the interaction interface of S2' with hNECD. Mutations mapped on the spike protein corresponding to the variants of concern are listed in Table 3.
It is to be noted that an attempt was made to investigate the interaction between hNECD and VS2' with standard pull-down assays, by expressing synthetic genes corresponding to hNECD and VS2' in E. coli. This study could not identify an interaction between the engineered proteins (Dayananda Siddavattam, private communication). This result could be considered a false negative for the following reason. The proteins encoded by the synthetic genes correspond to truncated domains devoid of their membrane-anchoring parts and other domains. It is very likely that these unanchored, isolated proteins do not fold into the required 3D structures under the experimental conditions, and folded structures are essential for protein-protein interactions.
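A simple way to see why the MC and MIs listed above cannot lie on the S2' interface is to compare their residue positions with the residue range of the VS2' structure used in this study (S816 to P1162). The sketch below does only that range check; the function names are hypothetical, and lying within the range is a necessary but not a sufficient condition for being at the interface, which additionally requires the interfacial residue list from the docked complex.

```python
import re

VS2_PRIME_RANGE = range(816, 1163)  # residues 816..1162 inclusive

MC_AND_MI = ["E484K", "L18F", "K417N", "K417T", "N439K", "L452R",
             "S477N", "S494P", "N501Y", "P681H", "P681R"]


def position(mutation):
    """Extract the residue number from a substitution such as 'E484K'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation}")
    return int(match.group(1))


def within_vs2_prime(mutations):
    """Return the mutations whose position falls inside the VS2' range."""
    return [m for m in mutations if position(m) in VS2_PRIME_RANGE]
```

Calling within_vs2_prime(MC_AND_MI) returns an empty list, since every listed position is below 816, whereas a substitution such as D950N falls inside the range and would need to be checked against the interface residues.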