Biochemical and Biophysical characterization of the main protease, 3-chymotrypsin-like protease (3CLpro), from the novel coronavirus disease 19 (COVID-19)

Severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) is responsible for the novel coronavirus disease 2019 (COVID-19). An appealing antiviral drug target is the coronavirus 3C-like protease (3CLpro), which is responsible for processing the viral polyproteins and liberating the functional proteins essential for the maturation and infectivity of the virus. In this study, multiple thermal analytical techniques were used to acquire the thermodynamic parameters of 3CLpro under different buffer conditions. 3CLpro exhibited relatively high thermodynamic stability over a wide pH range; however, the protease was less stable in the presence of salts. Divalent metal cations reduced the thermodynamic stability of 3CLpro more than monovalent cations, whereas altering the ionic strength of the buffer solution did not alter the stability of 3CLpro. Furthermore, the highest thermal kinetic stability of 3CLpro was recorded at pH 7.5, with the highest enthalpy of activation calculated from the slope of the Eyring plot. The biochemical and biophysical properties of 3CLpro explored here will help identify conditions that improve the solubility and stability of 3CLpro, supporting the setup of an enzymatic assay to screen for inhibitors that can serve as lead candidates in drug discovery and antiviral design against COVID-19.

Introduction
The novel coronavirus disease 2019 (COVID-19) is a pulmonary illness caused by severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2), with symptoms ranging from mild to lethal.1,2 Betacoronavirus, the genus that includes SARS-CoV-2, is one of four genera (Alpha-, Beta-, Gamma-, and Delta-) in the Coronaviridae family of positive-sense single-stranded RNA viruses. These viruses have one of the largest RNA genomes, at ~30 kilobases, with over 10 open reading frames (ORFs).3,4 Two polypeptides, polyprotein 1a (pp1a) and pp1ab, are synthesized through a ribosomal frameshift between ORF1a and ORF1b during translation. Together with the papain-like protease (PLpro), the 3-chymotrypsin-like protease (3CLpro), also known as the main protease, is one of two proteases responsible for the production of 16 non-structural proteins (nsps).3,5 The nsps play fundamental roles in replication, transcription, and virus recombination during an infection, so inhibiting the proteases will block the release of the nsps and inhibit the progression of COVID-19.4 As a result, 3CLpro of SARS-CoV-2 is an attractive target for the design of broad-spectrum antivirals against COVID-19.6 Like PLpro, 3CLpro plays an essential role in the posttranslational processing of SARS-CoV-2 polypeptides to generate the functional proteins essential for the maturation and infectivity of the virus. Among the coronavirus family, the substrate-binding pocket of 3CLpro is highly conserved, with glutamine and leucine/methionine required at the P1 and P2 positions, respectively, which correspond to the first and second residues before the cleavage site on the polypeptide substrate.
3CLpro cleaves the SARS-CoV-2 polyproteins at 11 sites with the recognition sequence "Leu-Gln↓(Ser, Ala, Gly)", where ↓ marks the cleavage site.7,8 Multiple crystal structures of 3CLpro have been deposited in the Protein Data Bank, including the recently determined structure in complex with α-ketoamide inhibitors.7 The Betacoronavirus genus contains four lineages, A, B, C, and D. Lineage B contains SARS-CoV-2 as well as SARS-CoV, which was responsible for the 2003 severe acute respiratory syndrome (SARS) outbreak with a mortality rate of 11%.9 The Middle East respiratory syndrome coronavirus (MERS-CoV), with a high mortality rate of ~40%, is part of lineage C.10 The 3CLpro enzymes from Betacoronaviruses share the same structural fold, with a highly conserved active site containing the catalytic dyad His41-Cys145. The monomer is divided into three domains. Domains I (residues 10-96) and II (residues 102-180) each have a five-stranded antiparallel β-barrel structure with a chymotrypsin-like folding scaffold (Figure 1A).7 The C-terminal domain III (residues 200-303), in contrast, has a cluster of five α-helices and is connected to domain II by a long loop (residues 181-199).
Domain III of 3CLpro from SARS-CoV was identified as important for dimerization and the formation of an active protease.11 The active site of 3CLpro lies at the interface between domains I and II. To better define the active site, we docked the peptide substrate (SAVLQSGF) of 3CLpro from porcine epidemic diarrhea virus (PEDV), PDB code 4ZUH, into the structure of 3CLpro from SARS-CoV-2, PDB code 6Y2E (Figure 1B).7,12 In contrast to the Ser-His-Asp triad of chymotrypsin, 3CLpro of SARS-CoV-2 has a catalytic Cys-His dyad, with the catalytic residue Cys145 positioned 2.5 Å from the carbonyl carbon of the glutamine of the peptide substrate. His41 and Cys145 of the catalytic dyad belong to domains I and II, respectively, and are 3.6 Å apart, a distance suitable for hydrogen bonding (Figure 1B).
The first step in the catalytic reaction of 3CLpro is the deprotonation of the thiol side chain of Cys145 by His41, which enables its nucleophilic attack on the carbonyl carbon of glutamine in the polyprotein backbone. Upon deprotonation, Cys145 forms a covalent thioester bond with the carbonyl carbon of the substrate backbone, which leads to cleavage of the peptide bond and release of the C-terminal part of the polypeptide substrate.7,11,13 Finally, a water molecule hydrolyzes the thioester linkage, displacing Cys145 and releasing the N-terminal segment of the polypeptide substrate. Thioester formation is an essential step in the catalytic mechanism of 3CLpro, and it is targeted in the development of antivirals.8
The biochemical and biophysical characterization of 3CLpro is essential for identifying optimum conditions for the enzymatic assay used to screen inhibitors that could be further developed as antivirals. Here, we characterize the thermodynamic and kinetic stability of 3CLpro from SARS-CoV-2 under different pH conditions and ionic strengths. 3CLpro was expressed in E. coli and purified to high purity. The secondary structure and native fold of the enzyme were confirmed by circular dichroism (CD) spectroscopy. 3CLpro was thermodynamically stable over a wide pH range of 6.0-10.0, with the highest stability recorded at pH 7.0. Interestingly, the presence of salts in the buffer solution decreased the thermodynamic stability of 3CLpro, with magnesium chloride decreasing the stability more than sodium chloride. On the other hand, increasing the ionic strength of the buffer solution by increasing the concentration of NaCl or MgCl2 did not further compromise the stability of 3CLpro. The thermal kinetic stability of 3CLpro was also investigated. The rate of thermal unfolding of 3CLpro was relatively slow at all pH values tested here, with the lowest unfolding rate recorded at pH 10.0. However, the highest enthalpy of activation was recorded at pH 7.5. The data acquired here suggest biochemical conditions for the enzymatic assay of 3CLpro, namely neutral or basic pH in the absence of salt. These conditions will help set up high-throughput screening protocols for the identification of 3CLpro inhibitors to be developed as antiviral therapeutics against COVID-19.

Results
Purification and Circular Dichroism (CD) spectrum of 3CLpro
The 3CLpro gene was cloned into the pET28b(+) vector, expressed in E. coli, and purified using Ni-NTA affinity and size-exclusion chromatography to >90% purity based on Coomassie-stained SDS-PAGE analysis. The overall expression yield was high, with 5 mg of 3CLpro obtained from one liter of Terrific Broth culture. The structural integrity of 3CLpro was verified using far-UV circular dichroism (CD) analysis. The spectrum exhibited two ellipticity minima at 208 and 222 nm, consistent with a chymotrypsin-like fold containing mixed α-helical and β-sheet structures (Figure 2A).14 After thermal denaturation, the spectrum of 3CLpro changed significantly and collapsed to a single broad peak with a minimum at 215 nm. The far-UV CD spectrum of 3CLpro was also collected at different pH values to verify whether the protein can tolerate a wide pH range. The spectrum of native 3CLpro was essentially unchanged at pH 5.0, 7.5, and 10.0, indicating limited perturbation of its secondary structure (Figure 2B). This conservation of secondary structure across pH values means that the biochemical and biophysical properties relevant to the enzymatic reaction of 3CLpro can be characterized with limited interference from changes in the overall structural integrity of the protein.
The effect of pH on the thermodynamic stability of 3CLpro
Differential scanning fluorimetry (DSF) was used to determine the melting temperature (Tm) from the global thermal unfolding of 3CLpro in the presence of a reporter dye, SYPRO Orange. The thermal unfolding transitions of 3CLpro were acquired at different pH values by monitoring the increase in fluorescence as SYPRO Orange binds to the exposed hydrophobic core of the protein (Figure 3A and 3B). The Tm was calculated at the midpoint of the DSF thermal transitions, with the highest Tm of 51.1 ± 0.4 °C recorded at pH 7.0 (Figure 3C). Surprisingly, 3CLpro tolerated a wide range of pH values with relatively high thermodynamic stability, with an average Tm of ~50.4 ± 0.6 °C between pH 6.0 and 9.0. The Tm decreased below pH 5.0 and above pH 10.0, with the lowest values of 45.6 ± 0.1 °C and 44.0 ± 0.7 °C recorded at pH 3.0 and 11.0, respectively. The ability of 3CLpro to tolerate a wide pH range was also confirmed by differential scanning calorimetry (DSC). The DSC thermograms of 3CLpro acquired at different pH values exhibited a single transition, with the Tm taken at the apex of the melting peak and the calorimetric enthalpy (ΔHcal) determined from the area under the thermographic peak (Figure 3D and 3E). Similar to the DSF thermal scans, 3CLpro was stable over a relatively wide pH range of 6.0-11.0, with the highest Tm of 55.0 ± 0.1 °C recorded at pH 7.0 (Figure 3F). The amplitude of the DSC thermographic transitions did not change significantly across the pH values tested here, except at pH 11.0. As a result, the ΔHcal values were similar across pH values, with an average of 77 kJ/mol, compared with 41 ± 0.4 kJ/mol at pH 11.0 (Figure 3G). The Tm values determined from DSC were approximately 4 °C higher than those calculated from DSF. This difference between the techniques is expected since each relies on a different measurement strategy: DSF includes a reporter dye to monitor the global unfolding and exposure of the protein's hydrophobic core, whereas DSC directly measures the thermodynamic parameters of unfolding of the protein sample.
In addition to DSF and DSC analyses, the thermal unfolding transition of 3CLpro at different pH values was acquired using CD spectroscopy. Thermal denaturation of 3CLpro produced a large change in the CD signal at 222 nm, which was used to monitor unfolding (Figure 3H). The Tm was determined at the midpoint of the thermal unfolding transitions after fitting the data to a Boltzmann sigmoidal function. The Tm values of 3CLpro at pH 5.0, 7.5, and 10.0 were 52.0 °C, 53.5 °C, and 56.0 °C, respectively. The Tm values acquired from CD thermal scans were in the same range as those acquired from DSC; however, the highest Tm from CD spectroscopy was at pH 10.0, compared with pH 7.0 from DSC analysis. The Tm calculated from CD thermal scans can differ from values acquired with other thermal analysis techniques. The signal in far-UV CD spectroscopy (190 nm to 240 nm) arises primarily from absorption by the amide groups of the polypeptide backbone, where different secondary structures with specific dihedral angles contribute to the CD absorption. The CD signal therefore reports on the protein's secondary structure elements, which distinguishes it from techniques such as DSC that directly measure the enthalpy change upon protein unfolding. Overall, 3CLpro exhibited relatively high thermodynamic stability over a wide pH range, as determined by the different thermal techniques used here.
The effect of metal ions and ionic strength on the thermodynamic stability of 3CLpro
Similar to the pH effect, the influence of salts and ionic strength, including monovalent (Na+) and divalent (Mg2+) cations, on the thermodynamic stability of 3CLpro was investigated. DSF was used to acquire the thermal unfolding transitions of 3CLpro at pH 7.5 in the absence or presence of NaCl or MgCl2. The stability of 3CLpro was also investigated using DSC in the presence of 0.25 M NaCl and 0.25 M MgCl2 at pH 7.5. Similar to the DSF analysis, the DSC thermographic peak shifted to lower temperature in the presence of salt, with a larger destabilizing effect for magnesium than for sodium (Figure 4D). The Tm of 3CLpro decreased from 54 °C in the absence of salt to 53 °C and 48 °C in the presence of 0.25 M NaCl and 0.25 M MgCl2, respectively (Figure 4E). The ΔHcal also decreased, from 86 kJ/mol in the absence of salt to 63 kJ/mol and 65 kJ/mol in the presence of NaCl and MgCl2, respectively (Figure 4F). Overall, the thermodynamic stability of 3CLpro was not affected by increasing the ionic strength through higher salt concentrations. However, the divalent metal cation (Mg2+) destabilized 3CLpro more than the monovalent cation (Na+).
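For reference, the ionic strength contributed by each salt can be estimated from the standard definition I = ½ Σ c_i z_i². The short Python sketch below is an illustration only: it assumes complete dissociation of the salts and ignores the contribution of the phosphate buffer itself, and the function name is hypothetical rather than part of the study. It shows that 0.25 M MgCl2 carries three times the ionic strength of 0.25 M NaCl, so a comparison of the two salts at equal molarity is not a comparison at equal ionic strength.

```python
def ionic_strength(ions):
    """Ionic strength I = 0.5 * sum(c_i * z_i**2).

    `ions` is a list of (molar concentration, charge) pairs.
    Assumes complete dissociation (a simplification).
    """
    return 0.5 * sum(conc * charge ** 2 for conc, charge in ions)

# 0.25 M NaCl dissociates into 0.25 M Na+ and 0.25 M Cl-
nacl = ionic_strength([(0.25, +1), (0.25, -1)])

# 0.25 M MgCl2 dissociates into 0.25 M Mg2+ and 0.50 M Cl-
mgcl2 = ionic_strength([(0.25, +2), (0.50, -1)])

print(f"0.25 M NaCl:  I = {nacl:.2f} M")
print(f"0.25 M MgCl2: I = {mgcl2:.2f} M")
```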

Denaturation Kinetics of 3CLpro
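Isothermal denaturation was used to determine the thermal kinetics of unfolding for 3CLpro by monitoring the ellipticity at 222 nm by CD spectroscopy at different incubation temperatures. The unfolding rates (kU) were obtained over a broad temperature range of 40 °C to 60 °C, where the Eyring plots show an increase in kU as a function of temperature. The linearity of the Eyring plots indicates no significant heat capacity change between the folded ground state and the transition state of the thermal unfolding of 3CLpro. The Eyring equation, shown below, was used to interpret the temperature dependence of the rate constants of 3CLpro unfolding:

$$\ln\left(\frac{k_U}{T}\right) = -\frac{\Delta H^{\ddagger}}{R}\,\frac{1}{T} + \ln\left(\frac{k_B}{h}\right) + \frac{\Delta S^{\ddagger}}{R}$$

where kB is Boltzmann's constant, h is Planck's constant, R is the gas constant, T is the absolute temperature, ΔH‡ is the enthalpy of activation, and ΔS‡ is the entropy of activation. A noticeable change is observed in the slopes of the Eyring plots, which indicates variation in ΔH‡ with pH, especially at pH 7.5. The ΔH‡ was determined from the slope of the lines (−ΔH‡/R) and was 171 kJ/mol, 233 kJ/mol, and 208 kJ/mol at pH 5.0, 7.5, and 10.0, respectively. As a result, 3CLpro displayed the highest kinetic stability at pH 7.5, even though the rate of protein unfolding was slower at pH 10.0. The ΔH‡ represents the energy barrier between the folded ground state and the partially unfolded transition state.15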
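The Eyring analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the rate constants are synthetic, generated from the ΔH‡ of 233 kJ/mol reported at pH 7.5 together with an arbitrary placeholder ΔS‡ of 400 J/(mol·K), and the function names are hypothetical. The fit is an ordinary least-squares line through ln(kU/T) versus 1/T, whose slope is −ΔH‡/R and whose intercept is ln(kB/h) + ΔS‡/R.

```python
import math

R = 8.314462618      # gas constant, J/(mol*K)
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s

def eyring_rate(temp_k, dh, ds):
    """Rate constant (1/s) predicted by the Eyring equation."""
    return (K_B * temp_k / H) * math.exp(ds / R) * math.exp(-dh / (R * temp_k))

def fit_eyring(temps_k, rates):
    """Least-squares line through ln(k/T) versus 1/T.

    Returns (dH in J/mol, dS in J/(mol*K)).
    """
    x = [1.0 / t for t in temps_k]
    y = [math.log(k / t) for k, t in zip(rates, temps_k)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    dh = -slope * R                             # slope = -dH/R
    ds = (intercept - math.log(K_B / H)) * R    # intercept = ln(kB/h) + dS/R
    return dh, ds

# Synthetic rate constants at 40-60 C. dH is the pH 7.5 value from the text;
# dS = 400 J/(mol*K) is an arbitrary placeholder, not a measured value.
temps = [c + 273.15 for c in (40, 45, 50, 55, 60)]
rates = [eyring_rate(t, 233e3, 400.0) for t in temps]

dh_fit, ds_fit = fit_eyring(temps, rates)
print(f"dH = {dh_fit / 1000:.1f} kJ/mol, dS = {ds_fit:.1f} J/(mol*K)")
```

With real data, the rates list would hold the kU values measured at each incubation temperature, and curvature in the plot would signal a heat capacity change between the ground and transition states.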

Conclusion
In the fight against COVID-19 and the spread of SARS-CoV-2, the discovery of antiviral drugs and the development of therapeutics are of great importance. A conserved step in the maturation of coronaviruses is the processing of their replicase polyproteins. One of the key enzymes in the production of new SARS-CoV-2 virus particles is the main protease, 3CLpro, which regulates replicase polyprotein processing and the release of functional proteins during virus infection. As a result, 3CLpro is an attractive target for the development of antiviral therapeutics against COVID-19. In this study, we described the expression of 3CLpro from SARS-CoV-2 and characterized its biochemical and biophysical properties to facilitate optimum conditions for drug screening and development. From DSF analysis, the addition of salts decreased the thermal stability of 3CLpro from SARS-CoV-2, with the Tm decreasing by 3.6 °C and 6.7 °C in the presence of NaCl and MgCl2, respectively. A similar result was observed from DSC analysis, where the Tm of 3CLpro decreased by 6.0 °C in the presence of MgCl2, whereas NaCl decreased the Tm by only 1.0 °C. The ΔHcal also decreased by ~22 kJ/mol in the presence of NaCl or MgCl2. The loss of thermal stability depended on the type of metal cation: divalent (Mg2+) cations had a more pronounced destabilizing effect on the thermodynamic stability of 3CLpro than monovalent (Na+) cations. On the other hand, the thermal stability of 3CLpro was independent of the ionic strength of the buffer solution, since increasing the concentration of sodium or magnesium chloride did not further reduce the thermal stability of 3CLpro.
The reduced thermal stability of 3CLpro in the presence of salt may be associated with the destabilization of salt bridges, since ion-pair networks have been shown to increase the thermal stability of proteins.20,21 Monovalent cations such as sodium can neutralize negatively charged residues and interrupt the formation of salt bridges, but they cannot form new cross-linked interactions. In contrast, the higher charge density of divalent cations leads to greater accumulation at, and interaction with, negatively charged and polar amino acid residues. Therefore, in addition to disrupting the ionic interactions that stabilize the protein structure, divalent cations can cross-link residues to form new salt bridges, which may enhance protein aggregation and further contribute to the stronger destabilizing effect of magnesium compared with sodium on the thermodynamic stability of 3CLpro.21
The thermal kinetic stability of 3CLpro was recorded at different pH values, where the rate of protein unfolding was monitored by CD spectroscopy at different incubation temperatures. The lowest rate of unfolding for 3CLpro was recorded at pH 10.0. The enthalpy of activation (ΔH‡) calculated from the slope of the Eyring plots was positive at all pH values tested here, reflecting the disruption of noncovalent interactions in the protein during the transition from the folded (ground) state to the activated (transition) state. The highest ΔH‡ was recorded at pH 7.5, with a value close to that at pH 10.0, where the slowest unfolding rate was recorded. Kinetic stability is related to the activation energy: it is proportional to the size of the kinetic barrier separating the native state from the unfolded state, so an increase in kinetic stability reflects an increase in the energy barrier between the folded ground state and the denatured or partially unfolded transition state.15 Overall, the highest kinetic stability of 3CLpro was recorded at basic pH, with relatively similar ΔH‡ values at pH 7.5 and 10.0.
The biochemical and biophysical properties of 3CLpro explored here highlight high thermodynamic and kinetic stability over a wide pH range, with a preference for more basic pH values between pH 7.5 and 10.0. However, the presence of salts, especially divalent metal cations, reduced the thermodynamic stability of 3CLpro, with no additional effect observed upon increasing the ionic strength. The properties explored here will facilitate the setup of optimum conditions for the enzymatic assay used to screen and identify inhibitors of 3CLpro for the development of new antiviral therapeutics to limit the spread of COVID-19.

Materials and Methods
Expression and purification of 3CLpro
The recombinant 3CLpro gene was cloned into the pET28b(+) bacterial expression vector by GenScript Inc. (Piscataway, NJ). The 6×His-tagged 3CLpro protein was expressed in E. coli BL21-CodonPlus-RIL (Stratagene). The inoculated culture (2-6 liters) was grown in Terrific Broth (TB) at 30 °C in the presence of 100 mg/L kanamycin and 50 mg/L chloramphenicol until the A600 reached 0.8. The temperature was then lowered to 15 °C, and expression was induced overnight with 0.5 mM IPTG. The cells were harvested by centrifugation at 12,000 × g at 4 °C for 10 min in an Avanti J26-XPI centrifuge (Beckman Coulter Inc.), then resuspended in lysis buffer (20 mM Tris pH 7.8, 150 mM NaCl, 5 mM imidazole, 3 mM βME, and 0.1% protease inhibitor cocktail from Sigma-Aldrich: P8849). The cells were lysed by sonication on ice, and the lysate was centrifuged at 40,000 × g for 45 min at 4 °C. The supernatant was loaded on a ProBond Nickel-Chelating Resin (Life Technologies) previously equilibrated with binding buffer (20 mM Tris pH 7.5, 150 mM NaCl, 5 mM imidazole, and 3 mM βME) at 4 °C. The resin was washed with 10 column volumes (cv) of binding buffer, followed by 15 cv of washing buffer (20 mM Tris pH 7.5, 150 mM NaCl, 25 mM imidazole, and 3 mM βME). The His-tagged 3CLpro enzyme was eluted from the column in 1 mL aliquots with 20 mM Tris pH 7.5, 150 mM NaCl, 300 mM imidazole, and 3 mM βME. Finally, the Ni-column fractions containing 3CLpro were loaded onto a HiLoad Superdex 200 size-exclusion column (GE Healthcare) using an ÄKTA purifier core system (GE Healthcare). The column was pre-equilibrated with filtration buffer (20 mM Hepes pH 7.5, 150 mM NaCl, and 0.5 mM TCEP). The final protein was collected and concentrated to ~150 μM based on the Bradford assay, and the sample purity was assessed by SDS-PAGE.
Differential scanning calorimetry (DSC) and Differential scanning fluorimetry (DSF)
The thermodynamic stability of 3CLpro was measured using a Nano-DSC (TA Instruments). The thermograms were acquired at 30 μM protease at different pH values in 100 mM phosphate buffer. The sample was heated at a scan rate of 1 °C/min from 15 °C to 75 °C at 3 atm pressure. The background scans were obtained by loading degassed buffer in both the reference and sample cells and heating at the same rate. The DSC thermograms were corrected by subtracting the corresponding buffer baseline and converted to plots of excess heat capacity (Cp) as a function of temperature. The melting point (Tm) was taken as the temperature at the maximum of the thermal transition, and the calorimetric enthalpy (ΔHcal) of the transition was estimated from the area under the thermal transition using Nano Analyzer software from TA Instruments. Additional DSC scans were collected at different ionic strengths in the presence or absence of 250 mM NaCl or 250 mM MgCl2 in 50 mM phosphate buffer at pH 7.5.
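The DSC data reduction described above (buffer subtraction, Tm at the peak maximum, ΔHcal from the peak area) can be sketched as follows. This is an illustration under stated assumptions and not the Nano Analyzer procedure: the thermogram is synthetic (a flat buffer baseline plus a Gaussian transition centred at 55 °C with an area set to 77 kJ/mol), the area is computed with a simple trapezoidal rule, and the function name is hypothetical. Real thermograms usually also need a pre- and post-transition baseline fit, which is omitted here.

```python
import math

def analyze_thermogram(temps_c, sample_cp, buffer_cp):
    """Subtract the buffer scan, then return (Tm, dH_cal).

    Tm is the temperature at the maximum excess heat capacity, and dH_cal is
    the trapezoidal area under the excess heat capacity curve.
    """
    excess = [s - b for s, b in zip(sample_cp, buffer_cp)]
    peak_index = max(range(len(excess)), key=lambda i: excess[i])
    area = sum(
        0.5 * (excess[i] + excess[i + 1]) * (temps_c[i + 1] - temps_c[i])
        for i in range(len(temps_c) - 1)
    )
    return temps_c[peak_index], area

# Synthetic scan (illustrative values only): 15-75 C in 0.1 C steps.
temps = [15.0 + 0.1 * i for i in range(601)]
width = 3.0                                          # assumed peak width, C
height = 77.0 / (width * math.sqrt(2.0 * math.pi))   # kJ/(mol*K)
buffer_scan = [1.0] * len(temps)                     # flat buffer baseline
sample_scan = [
    1.0 + height * math.exp(-0.5 * ((t - 55.0) / width) ** 2) for t in temps
]

tm, dh_cal = analyze_thermogram(temps, sample_scan, buffer_scan)
print(f"Tm = {tm:.1f} C, dH_cal = {dh_cal:.1f} kJ/mol")
```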
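In addition to DSC analysis, the Tm of 3CLpro was determined using DSF measurements in the presence of SYPRO Orange fluorescent reporter dye using a real-time QPCR instrument (Mx3005P QPCR system, Agilent Technologies, La Jolla, CA), with excitation and emission at 492 nm and 610 nm, respectively. The DSF thermal scans were acquired for 3CLpro in the presence of 3X SYPRO Orange at different pH values utilizing 50 mM buffer. The fluorescence measurements were collected from 25 °C to 80 °C at a fixed temperature ramp rate of 1 °C/min, and the data were fitted to obtain the Tm.
Circular dichroism (CD) spectra and kinetic stability analysis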
The CD spectra of 3CLpro were collected in 100 mM phosphate buffer at pH 5.0, 7.5, and 10.0 from 190 nm to 260 nm at a scanning speed of 10 nm/s on a Chirascan™ CD spectrometer (Applied Photophysics). CD analysis was performed at a protease concentration of 30 μM using a 1 mm quartz cuvette and 1 nm bandwidth at 25 °C. The thermal denaturation profiles of 3CLpro were determined from the heat-induced conformational transition from the native to the denatured state by monitoring the change in ellipticity at 222 nm while the sample temperature was increased at a rate of 1.0 °C/min. The same sample conditions and instrument setup were used as in the CD spectrum analysis. The thermal transition measurements were conducted at different pH values and normalized to fraction unfolded (FUnf) using the following equation:

$$F_{Unf} = \frac{\theta - \theta_N}{\theta_D - \theta_N}$$

where θ is the ellipticity of the protein at a specific time, and θN and θD are the ellipticities of the native and denatured states, respectively. θN was obtained before temperature incubation of 3CLpro, and θD was obtained at the end of the measurement, after incubating the protein at 80 °C for one hour. The data were fitted to a Boltzmann sigmoidal function, and the Tm was calculated at the midpoint of the transition using the Excel add-on package XLfit (IDBS Limited, Bridgewater, NJ, U.S.A.) as described previously.19
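The normalization to fraction unfolded and the extraction of Tm from a Boltzmann sigmoid can be illustrated with the short Python sketch below. It is not the XLfit procedure used in this study: instead of a nonlinear fit, it uses the fact that a Boltzmann sigmoid F = 1/(1 + exp((Tm − T)/s)) becomes a straight line in ln(F/(1 − F)) versus T, and fits that line by least squares over the transition region. The ellipticity values, the slope factor s, and the function names are assumptions chosen for illustration; the Tm of 53.5 °C is the pH 7.5 value reported above.

```python
import math

def fraction_unfolded(theta, theta_n, theta_d):
    """F_Unf = (theta - theta_N) / (theta_D - theta_N)."""
    return (theta - theta_n) / (theta_d - theta_n)

def boltzmann(temp, tm, slope):
    """Boltzmann sigmoid: 0 for the native state, 1 for the denatured state."""
    return 1.0 / (1.0 + math.exp((tm - temp) / slope))

def fit_tm(temps, fractions, lo=0.05, hi=0.95):
    """Estimate (Tm, slope factor) from a line through logit(F) versus T.

    Only points inside the transition (lo < F < hi) are used, because the
    logit is undefined at 0 and 1 and noisy near the plateaus.
    """
    pts = [(t, math.log(f / (1.0 - f)))
           for t, f in zip(temps, fractions) if lo < f < hi]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    b = (sum((t - mean_t) * (y - mean_y) for t, y in pts)
         / sum((t - mean_t) ** 2 for t, _ in pts))
    a = mean_y - b * mean_t
    return -a / b, 1.0 / b   # logit(F) = (T - Tm)/s, so Tm = -a/b and s = 1/b

# Synthetic melt (illustrative values): ellipticity at 222 nm in mdeg.
theta_native, theta_denatured = -20.0, -5.0
temps = [float(t) for t in range(25, 81)]
thetas = [theta_native + (theta_denatured - theta_native) * boltzmann(t, 53.5, 2.0)
          for t in temps]

fractions = [fraction_unfolded(th, theta_native, theta_denatured) for th in thetas]
tm_fit, slope_fit = fit_tm(temps, fractions)
print(f"Tm = {tm_fit:.1f} C, slope factor = {slope_fit:.2f} C")
```

For noisy experimental data, a nonlinear least-squares fit of the sigmoid to all points is generally preferable to this linearized form.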
Finally, the thermal kinetic stability of 3CLpro was determined using isothermal denaturation analysis to calculate the rate of thermal unfolding after incubating the protein sample in 100 mM phosphate buffer at different temperatures from 40 °C to 65 °C and pH values of 5.0, 7.5, and 10.0. The ellipticity (θ) at 222 nm was continuously collected for 30 min and used to calculate the FUnf as described above. The rate of unfolding (kU) was determined from the slope after fitting the data to a straight line.

Figures
Figure 1 The crystal structure of 3CLpro. (A) Cartoon representation of the structural domains of 3CLpro of SARS-CoV-2 (PDB code 6Y2E). Domain I (residues 10-96) is shown in yellow, domain II (residues 102-180) in green, and domain III (residues 200-303) in pink. The peptide substrate (blue) is shown in ball-and-stick representation, and it is located at the interface between domains I and II. (B) The active site of 3CLpro, with the peptide substrate in blue and the glutamine at the P1 site in white. The catalytic residue Cys145, which is part of domain II, is 2.5 Å from the backbone carbonyl carbon of the glutamine of the peptide substrate. His41 of the catalytic dyad, which is part of domain I, is 3.6 Å from Cys145. The figure was prepared using PyMOL (Schrödinger, LLC).