Identification of SARS-CoV2 Main Protease Coldspots Suitable for Drug Targeting

Most attempts to target the novel coronavirus SARS-CoV2 are focusing on the main protease (Mpro) 1-9. However, >19,000 missense mutations in the Mpro have already been reported 10. The mutations encompass 282 amino acid positions, and these “hotspots” might change the Mpro structure and activity, potentially rendering novel antivirals and vaccines ineffective. Here we identified 24 mutational “coldspots” that have resisted mutation since the virus was first detected. We compared the structure-function relationship of these coldspots with several SARS-CoV2 Mpro X-ray crystal structures. We found that three coldspot residues (Leu141, Phe185 and Gln192) help to form the active site, while six (Gly2, Arg4, Tyr126, Lys137, Leu141 and Leu286) contribute to dimer formation that is required for Mpro activity. The surface of the dimer interface is more resistant to mutations compared to the active site. Interestingly, 16 coldspots are found in conserved patterns when compared with other coronaviruses. Importantly, several conserved coldspots are available on the surface of the active site and at the dimer interface for targeting. The identification and shortlisting of these coldspots offer a new perspective for targeting the SARS-CoV2 Mpro while avoiding mutation-based drug resistance.

stage of the pandemic. Although the 24 mutational coldspots are shortlisted from 306 residues, we studied structures of SARS-CoV2 Mpro to understand their structure-function relevance.
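The shortlisting step can be illustrated with a minimal sketch, assuming the input is simply a list of residue positions, one per reported missense mutation. The function name `find_coldspots` and the toy input are hypothetical and are not the pipeline used in this study; the only values taken from the text are the protein length (306 residues) and the counts of 282 mutated positions and 24 coldspots, which are consistent with each other (306 - 282 = 24).

```python
from collections import Counter

MPRO_LENGTH = 306  # residues in SARS-CoV2 Mpro, as stated in the text


def find_coldspots(mutated_positions, length=MPRO_LENGTH):
    """Return 1-based residue positions with no recorded missense mutation."""
    counts = Counter(mutated_positions)
    return [pos for pos in range(1, length + 1) if counts[pos] == 0]


# Hypothetical toy input (not real GISAID data): one entry per reported
# missense mutation, giving the residue position it affects.
toy_mutations = [1, 3, 3, 5, 5, 5, 6]
toy_coldspots = find_coldspots(toy_mutations, length=6)
print(toy_coldspots)  # [2, 4]
```

The same counts could also be used to rank hotspots by mutation frequency, but only the zero-count positions matter for the coldspot definition used here.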

Coldspots at the active/inhibitor site

To analyze the coldspots in and around the active site, we selected five high-resolution 3D structures (Protein Data Bank (PDB) codes: 6LU7, 6Y2F, 6LZE, 6M0K, and 7BUY), out of several SARS-CoV2 Mpro structures, that had been co-crystallized with antiviral drug candidates in recently published studies 3,5,6,9. However, the inhibitors were not optimal for SARS-CoV2 15. We believe the non-mutational residues (coldspots) could be appropriate target regions for designing effective inhibitors of SARS-CoV2 Mpro. In the SARS-CoV2 Mpro structures, domains I (8-101) and II (102-185) play major roles in the formation of the active site and provide binding sites for inhibitors, while domain III is important in the regulation of protease activity 3,9. The catalytic dyad His41 and Cys145 is located at the active site, which forms in a cleft between domains I and II. Most efforts to design antiviral inhibitors using drug repurposing approaches are focused on targeting this active site 3,11. We found 15 coldspots in domains I and II, and the remaining nine in domain III (Fig. 1c). The inhibitor-binding sites in the five SARS-CoV2 Mpro structures were superimposed and are represented as a surface model in the 3D structures (Fig. 1d, 1e and 1f), which show that a total [...]

Interestingly, we mapped three coldspots, Leu141, Phe185, and Gln192, in the 6LU7-N3 complex (Fig. 1f). The structural importance of these coldspots was emphasized by recent X-ray crystallographic studies 5,7 demonstrating the involvement of these spots in the formation of substrate-binding sites, and of Phe185 and Gln192 in the stability of the active site. We found coldspots Asn133 and Lys137 beneath the surface formed by the binding-site residues (Fig. 1e); specifically, Leu27, Asn119, and Gly146 are near the catalytic dyad (His41 and Cys145). They may provide some support to the catalytic center, as evidenced by a recent study in which Leu27 was found to play a key role in the activity of the SARS-CoV2 Mpro structure 8. Leu27 and Asn119 are also involved in the formation of the binding site in SARS-CoV Mpro 16 (Table 1). However, based on our data analysis, the other pocket-forming residues in the structures undergo mutations, which may modify the shape of the binding pocket. This prediction is supported by a recent study 17, in which the structures of the mutants Met49Ile, Pro184Leu/Ser, and Ala191Val were shown to deviate substantially from the wildtype. Thus, the residues were assumed to belong to the mobile regions of the active site, which control the conformational changes that may be required for catalysis. This indicates that coldspots are required at the active site to maintain effective targeting.
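As an illustration of the domain boundaries quoted above, the following sketch assigns residues named in this section to domain I (8-101) or domain II (102-185). The helper `domain_of` is hypothetical; residues outside the two quoted ranges are reported as `None` rather than assigned to a domain, since no other boundaries are given here.

```python
# Domain boundaries quoted in the text for SARS-CoV2 Mpro.
DOMAINS = {"I": (8, 101), "II": (102, 185)}


def domain_of(position):
    """Return the domain label for a residue, or None if outside the quoted ranges."""
    for label, (start, end) in DOMAINS.items():
        if start <= position <= end:
            return label
    return None


# Residues named in this section, keyed by residue number.
residues = {27: "Leu", 41: "His", 119: "Asn", 133: "Asn", 137: "Lys",
            141: "Leu", 145: "Cys", 146: "Gly", 185: "Phe", 192: "Gln"}

for pos, name in sorted(residues.items()):
    print(f"{name}{pos}: domain {domain_of(pos)}")
```

Under these ranges the catalytic dyad is split across the two domains (His41 in domain I, Cys145 in domain II), which is consistent with the active site forming in the cleft between them.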

Phe140 (1 mutation), Cys145 (3 mutations), Glu166 (3 mutations), and His172 (1 mutation) showed low mutation frequencies (a total of eleven out of 525 mutations at the active site) (Fig. 1g). This suggests that the residues involved in critical functions at the active site are mutated less frequently than other residues.
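The proportion quoted above can be made explicit with a short calculation. This is only an arithmetic sketch using the two totals given in the text (eleven mutations at the functionally critical residues, 525 mutations at the active site); the variable names are illustrative.

```python
# Totals quoted in the text.
critical_mutations = 11       # mutations at the functionally critical residues
active_site_mutations = 525   # all mutations recorded at the active site

share = critical_mutations / active_site_mutations
print(f"{share:.1%}")  # 2.1%
```

In other words, roughly one in fifty of the active-site mutations falls on these residues.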

Coldspots at the dimer interface

An alternative therapeutic strategy is to design antiviral agents that target the dimerization of the SARS-CoV Mpro, as the dimeric form is essential for activity 18,19; given the 98% identity between the two proteases, this strategy is also applicable to SARS-CoV2 Mpro 4,7. Here, we examined the functional relevance of coldspots on the surface of the dimer interface in SARS-CoV2 (PDB code: 6LU7), as they could provide mutation-resistant drug and vaccine target sites (Fig. 2a). Half of the 24 coldspot positions are on the surface of the protease (Fig. 2a-2b), and the rest are buried. We discovered seven coldspot positions (Gly2, Arg4, Tyr126, Lys137, Leu141, Leu286, and Leu287) on the surface that are involved in the formation of the dimer interface in the SARS-CoV2 Mpro (Fig. 2c and 2d). They form two sites: the first is based on the positions Gly2, Arg4, Tyr126, Lys137, and Leu141 (Fig. 2c), and the second site includes the positions Arg4, Lys137, Leu286, and Leu287 (Fig. 2d). In the SARS-CoV Mpro, these sites include several key interactions, Arg4-Lys137-Glu290 20, Gly2-Arg4-Tyr126 21, Ser284-Tyr285-Leu286 22, and Ser1-Glu166-His163-His172 23,24, that have been experimentally proven to be vital for maintaining the dimer interface and the active site (Table 1).
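The relationship between the two dimer-interface sites can be checked with a small set-based sketch. The residue lists are taken directly from the text; the variable names and the helper `residue_number` are illustrative only.

```python
# Coldspot positions forming the two dimer-interface sites described in the text.
site_1 = {"Gly2", "Arg4", "Tyr126", "Lys137", "Leu141"}  # Fig. 2c
site_2 = {"Arg4", "Lys137", "Leu286", "Leu287"}          # Fig. 2d


def residue_number(residue):
    """Extract the position from a label such as 'Lys137' (three-letter code + number)."""
    return int(residue[3:])


shared = sorted(site_1 & site_2, key=residue_number)
all_interface = sorted(site_1 | site_2, key=residue_number)

print(shared)              # ['Arg4', 'Lys137']
print(len(all_interface))  # 7
```

The union recovers the seven interface coldspots listed above, and the two residues shared by both sites are Arg4 and Lys137.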

In SARS-CoV2 Mpro, we observed a hydrogen bond between Arg4 and Lys137 (Fig. 2c). As both are coldspots (with three other coldspots nearby, Gly2, Tyr126, and Leu141), this appears to be a potential site for inhibition. It also appears slightly similar to the one recently proposed [...]

The other structures of SARS-CoV2 Mpro also confirm the functional relevance of the coldspot residues Gly2, Arg4, Tyr126, Lys137, Leu141, and Leu286, which are directly involved in dimer formation through various interactions 7,9 (Table 1). A recent electrophilic screening of 1,250 fragments provided three hits (Z1849009686, Z264347221, and POB0073) that bind to the dimer interface, and it was suggested these fragments might be used as quasi-allosteric [...] HCoV-229E 26 and, together with 141, is involved in the regulation of catalytic activity 21,28.

These findings correlate with our hypothesis that the observed coldspots may serve as mutation-resistant allosteric sites. [...] (Table 1).

Biological relevance

The SARS-CoV2 Mpro is accumulating mutations at many hotspots; thus, it is essential to identify consistent mutational coldspots that can be targeted with antiviral drugs. The data on nearly 20,000 global mutations used in this study, collected at the end of the first wave of COVID-19, are limited. However, the identified mutational coldspots have biological relevance, according to the high-resolution X-ray structures of SARS-CoV2, sequence conservation among CoVs, and experimental evidence provided by the published X-ray structures of other CoV proteases 18,22,23,25-28,31,32 (Table 1). We now have a short list of promising targets that might be considered before embarking on the time-consuming translational research underlying antiviral design.

The observed mutational frequencies in the hotspots at the active site and dimer interface indicate that the virus may be developing protective strategies against inhibitors. This correlates with the findings described in recent reports, which show that these positions are changing the shape of the sites via mutations and plasticity 17,33. However, the coldspots identified here might be good areas to target. Although some of these coldspots may convert to hotspots in the future, the frequency of the new mutations is likely to be minimal, as the data in this study show that the sites play critical structural roles and are mutation-resistant. This is evident from their ability to avoid mutations over the 11 months since the virus was first detected. However, further research is warranted for a deeper understanding of the phenomenon.

We acknowledge GISAID for disseminating SARS-CoV2 data. We would like to thank all the communities worldwide involved in and supporting the response to the COVID-19 pandemic.