3.1 High throughput screening via molecular docking and scoring
We screen all candidate proteins from the protein database and report their binding energies using our semi-automated AutoDock Vina tool, developed in Python. The proteins are initially selected via keyword search for binding to “cortisol” and then fed to the tool to establish their suitability. Proteins with high sequence similarity, as evaluated by BLAST scores, are set aside as duplicates. This methodology allows us to consider the entire available list of candidate proteins without excluding any because of limited computational power. The binding affinities of the top candidate PDB entries obtained by molecular docking are listed in Table 1.
Table 1: List of binding energies of top 7 candidates
PDB ID | Binding Energy [kcal/mol]
6NWL | -10.75
7M8V | -10.5
2V95 | -9.67
2VDY | -9.28
6HGC | -9.11
6ITP | -8.32
NewBG – III (Alpha1-antichymotrypsin variant) | -7.72
The top three scoring candidates listed in Table 2 were selected for further molecular dynamics (MD) calculations and validated by performing umbrella sampling and computing the potential of mean force.
The remaining approximately 35 candidate proteins are presented with their binding affinities in Table SI 1 for completeness. The candidate receptors were subsequently ranked according to their binding affinity, and a continuous sequence of amino acids was selected from the active binding pocket of each as a candidate bioreceptor. This biomimetic procedure enables rapid selection of receptors compared with ab initio design of peptides via computational peptidology36.
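The ranking step can be illustrated with a minimal Python sketch that uses the Table 1 values. It sorts by docking score only; the function name `rank_candidates` and the shortened key `NewBG-III` are illustrative choices, not part of the authors' tool, and the selection described in the text also weighs criteria other than the score, such as a continuous sequence of interacting residues.

```python
# Binding energies from Table 1 (kcal/mol); more negative means stronger predicted binding.
# "NewBG-III" abbreviates the Alpha1-antichymotrypsin variant entry.
binding_energies = {
    "6NWL": -10.75,
    "7M8V": -10.5,
    "2V95": -9.67,
    "2VDY": -9.28,
    "6HGC": -9.11,
    "6ITP": -8.32,
    "NewBG-III": -7.72,
}


def rank_candidates(energies, top_n=None):
    """Return (id, energy) pairs sorted from most to least negative energy."""
    ranked = sorted(energies.items(), key=lambda item: item[1])
    return ranked if top_n is None else ranked[:top_n]


top_three = rank_candidates(binding_energies, top_n=3)
for pdb_id, energy in top_three:
    print(f"{pdb_id}: {energy:.2f} kcal/mol")
```

Sorting in ascending order places the most negative (strongest predicted) binding energy first, so taking the first `top_n` entries gives the top scorers.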
3.2 Advantages of peptide bioreceptors over conventional antibody based bioreceptors
One of the most common methods of developing biosensors uses antibodies as the bioreceptor. Antibodies accompany the biological response to disease and injury, which facilitates their use in biosensors. Antibody-based electrochemical sensors have traditionally been used to measure cortisol and other biomolecules because of their high specificity and sensitivity. However, limitations such as storage requirements, temperature instability, high cost, cross-reactivity, and batch-to-batch variability prompted us to explore alternatives. Antibodies are also prone to denaturation and degradation, which can affect their binding affinity and specificity. To address these issues, we propose the development of an inexpensive synthetic peptide as an alternative to these antibodies.
Furthermore, a comprehensive biosensor design requires a multi-parameter optimization platform. A conventional antibody-based sensor may not provide a complete solution, because many other factors must also be designed, such as immobilization on the substrate, solubility, sequence length, and cost. Antibodies are large molecules that are not readily synthesized and can be chemically unstable19,21. This instability can cause errors and inaccuracies in biosensor readings. Their relatively large size also limits the number of antibodies that can be placed on the biosensor surface. These challenges have motivated our research into improved biorecognition via peptide design.
3.3 Baseline peptide selection protocol and comparison with corticosteroid-binding globulins (CBGs)
The candidate bioreceptor peptide is systematically evolved by iteratively selecting a continuous sequence of amino acids (< 40) from the active binding sites of the top three candidate proteins, subject to constraints such as sequence length, binding affinity with cortisol, and interactions with interfering species such as progesterone, testosterone and glucose. Particular attention is paid to the binding affinity of the selected peptide with progesterone, since CBGs are known to bind this hormone relatively well22. The selected candidate assessments are listed in Table SI 2, along with the interference studies in Table 3. Subsequently, the selected baseline peptide is assessed and finally validated via molecular dynamics simulations.
Candidate peptides are iteratively assessed, as presented in Table SI 2, to arrive at the baseline peptide. This peptide is 37 amino acids long, is selected from protein 2V95, and has the single-letter sequence “CQLIQMDYVGNGTAFFILPDQGQMDTVIALSRDTIDR”. The N-terminal residue is selected as CYS to facilitate thiol bonding with gold electrodes for future sensor development.
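The idea of picking a continuous stretch of fewer than 40 residues around a binding site can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the selection algorithm, so the sketch uses coverage of binding-site residues as a stand-in for the docking-based assessment. The parent sequence, the site positions, and the function names are all hypothetical.

```python
def candidate_windows(site_positions, max_len=39):
    """Yield (start, end, covered) for contiguous windows of at most max_len residues.

    site_positions are 0-based indices of binding-site residues in the parent
    sequence. Windows start and end on a site residue, since extending past the
    outermost site adds length without adding coverage.
    """
    sites = sorted(set(site_positions))
    for i, start in enumerate(sites):
        for j in range(i, len(sites)):
            last = sites[j]
            if last - start + 1 > max_len:
                break  # later sites are even further away
            yield start, last + 1, j - i + 1


def best_window(parent, site_positions, max_len=39):
    """Return the window covering the most site residues, preferring shorter ones."""
    if any(p < 0 or p >= len(parent) for p in site_positions):
        raise ValueError("site position outside the parent sequence")
    windows = list(candidate_windows(site_positions, max_len))
    if not windows:
        raise ValueError("no binding-site positions supplied")
    start, end, covered = max(windows, key=lambda w: (w[2], -(w[1] - w[0])))
    return parent[start:end], covered


# Hypothetical toy data, not taken from 2V95 or any real structure.
toy_parent = "ACDEFGHIKLMNPQRSTVWY" * 3
toy_sites = [5, 8, 12, 30, 33, 41, 58]
peptide, covered = best_window(toy_parent, toy_sites)
print(len(peptide), covered)
```

In a real workflow each shortlisted window would then be docked against cortisol and the interfering species, which this sketch does not attempt.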
3.4 Baseline peptide binding energies with competing species
The baseline peptide is then modelled with competing species such as glucose, progesterone and testosterone as listed in Table 3.
The interaction between the baseline peptide and testosterone was computed as −4.29 kcal/mol, compared with a binding energy of −6.35 kcal/mol with cortisol.
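To give a sense of scale for the difference between these two energies, the short sketch below converts it into an implied ratio of association constants using ΔG = −RT ln K. The temperature of 298.15 K and the function name `affinity_ratio` are assumptions made for illustration. Docking scores are approximate estimates rather than measured free energies, so the ratio is indicative only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def affinity_ratio(dg_target, dg_competitor, temperature_k=298.15):
    """Ratio K_target / K_competitor implied by two binding free energies (kcal/mol).

    Uses dG = -RT ln K for each species, so the ratio depends only on the
    difference between the two energies. The default temperature is an assumption.
    """
    ddg = dg_target - dg_competitor
    return math.exp(-ddg / (R_KCAL * temperature_k))


dg_cortisol = -6.35      # kcal/mol, from the text
dg_testosterone = -4.29  # kcal/mol, from the text
ratio = affinity_ratio(dg_cortisol, dg_testosterone)
print(f"ddG = {dg_cortisol - dg_testosterone:.2f} kcal/mol, implied ratio ~ {ratio:.0f}")
```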
3.5 Baseline peptide similarity using smart BLAST
Top candidates from the list were selected for further analysis, primarily the proteins 2V95 and 2VDY, i.e., rat and human CBGs respectively, for their relatively high binding affinity and continuous sequence of interacting amino acids. These proteins were compared using the Basic Local Alignment Search Tool (BLAST)37,38 on the NCBI (National Library of Medicine) online server and found to have > 75% similarity overall, especially in their active binding sites for the intended target ligand cortisol, as illustrated in Figure S1. BLAST finds regions of similarity between biological sequences: the program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of the matches. These sequences, 371 and 373 AA long, support the biomimetic route as opposed to ab initio peptide design via extremely computationally intensive combinatorial methods30. The selected baseline peptide was subsequently compared with other proteins in the database using smart BLAST, and a “landmark match” was observed with the CBGs from both human and rat, as illustrated in Figure S2. We suggest that these large proteins have evolved to satisfy a number of design requirements and other considerations, resulting in macromolecules with sequences > 370 AA long. Furthermore, the natural design considerations for a CBG differ from those of a bioreceptor and result in separate structures. In this work, we propose a lean peptide of approx. 1/10th the sequence length, with comparable binding affinity and tertiary structure, as a baseline candidate for developing the intended bioreceptor. The design considerations for our proposed receptor are specific to bioreceptor development and can therefore be limited to binding affinity with the target ligand, solubility, sequence length, tertiary structure, agglomeration potential, binding to gold electrodes, etc.
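As a rough illustration of what a percentage figure from a pairwise alignment means, the sketch below computes percent identity over the aligned, non-gap columns of two already-aligned sequences. It is not BLAST: it performs no alignment or statistics. BLAST reports identities and positives separately, and the text does not state which measure the > 75% figure refers to. The two toy strings are hypothetical and are not the CBG sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences (one gap column, two mismatches).
toy_a = "ACDEFGH-IKL"
toy_b = "ACDQFGHMIKV"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```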
3.6 Significance of in silico eccrine sweat model
The choice of solvent model, i.e., the eccrine sweat model27 developed by the same team, has a significant impact on the stability and strength of protein-ligand complexes. Several mechanisms are influential. The hydrophobic effect causes the hydrophobic regions39 of the baseline peptide to minimize their exposure to water molecules by binding to hydrophobic ligands, which can increase binding affinity. The hydrogen-bonding capability of the solvent can influence the formation and strength of hydrogen bonds between proteins and ligands, as shown in Figure S3. Other solvent properties such as dielectric constant, viscosity and pH also affect the binding affinity. Therefore, a full atomistic molecular dynamics simulation is performed in eccrine sweat solution as a validation exercise.
3.7 Validation of efficacy of the baseline peptide via steered molecular modelling
We performed MD simulations of the peptide–ligand complex in eccrine sweat solution. The molecules were modelled using the CHARMM36 force field and the SPC/E water model. The complete system, in a 10 × 10 × 10 nm³ box, was energy-minimized by steepest descent and then equilibrated in the NPT ensemble for 10 ns with a 2 fs time step. The evolution of the potential energy, the root mean square deviation (RMSD) of the complexed protein and ligand, and the density confirm the stability of the complex and the equilibration of the system, as shown in Fig. 3 and Figure SI 4.
The protein was position-restrained, and the pull simulation was conducted with the protein aligned along the positive Z axis, which was selected as the reaction coordinate. Subsequently, we generated 25 individual configurations by pulling cortisol away from the protein over the course of 10 ns of MD, saving snapshots every 100 ps. The pull rate was constant at 1 nm/ns, enforced by a harmonic spring with a force constant of 1000 kJ mol⁻¹ nm⁻².
These configurations were selected such that they are spaced approximately 0.2 ns apart and were simulated independently to cover a displacement of 5 nm. Finally, we employed the weighted histogram analysis method (WHAM) to extract the potential of mean force (PMF). The binding energy ΔG is then simply the difference between the highest and lowest values of the PMF curve, provided that the PMF converges to a stable value at large distance along the selected reaction coordinate, as shown in Fig. 4.
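The window arithmetic and the ΔG definition above can be checked with a small Python sketch. It assumes that the pulled ligand follows the pull reference exactly, so that displacement equals pull rate times elapsed time. The PMF values are synthetic placeholders, not results from this work, and the plateau tolerance and function names are illustrative. In practice the PMF would come from a WHAM analysis of the umbrella windows.

```python
def window_schedule(n_windows=25, spacing_ps=200, snapshot_ps=100,
                    pull_rate_nm_per_ns=1.0):
    """Return (frame_index, time_ps, nominal_displacement_nm) for each window."""
    if spacing_ps % snapshot_ps != 0:
        raise ValueError("window spacing must be a multiple of the snapshot interval")
    stride = spacing_ps // snapshot_ps  # every 2nd snapshot for 200 ps / 100 ps
    schedule = []
    for w in range(n_windows):
        frame = w * stride
        time_ps = frame * snapshot_ps
        displacement_nm = pull_rate_nm_per_ns * time_ps / 1000.0
        schedule.append((frame, time_ps, displacement_nm))
    return schedule


def pmf_plateau_reached(pmf, tail=3, tolerance=1.0):
    """True if the last `tail` PMF values vary by no more than `tolerance`."""
    tail_values = pmf[-tail:]
    return max(tail_values) - min(tail_values) <= tolerance


def binding_free_energy(pmf):
    """Magnitude of dG as defined in the text: highest minus lowest PMF value."""
    return max(pmf) - min(pmf)


schedule = window_schedule()
print(f"{len(schedule)} windows, last at {schedule[-1][2]:.1f} nm")

# Synthetic PMF along the reaction coordinate (arbitrary energy units).
synthetic_pmf = [1.0, 0.0, 2.0, 4.5, 6.0, 6.5, 6.5, 6.5]
if pmf_plateau_reached(synthetic_pmf):
    print(f"dG magnitude = {binding_free_energy(synthetic_pmf):.1f}")
```

With these settings the 25 windows sit at nominal displacements from 0 to 4.8 nm in 0.2 nm steps, which matches the stated coverage of about 5 nm.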
A close inspection of the energetically favourable binding conformations shows that the native protein 2V95 and the baseline peptide, each bound to the target ligand cortisol, follow a similar trend in explicit molecular dynamics simulations performed with the GROMACS package in the eccrine sweat model. A series of intermediate molecular modelling plots is presented in Figure SI 4.
3.8 Proposed baseline peptide and its advantages
This work is intended to arrive systematically at a candidate baseline peptide. In addition to binding affinity towards cortisol and sequence length, the proposed peptide bioreceptor has several other design considerations, such as ease of immobilization on gold electrodes and solubility. Furthermore, the CBGs presented in Table SI 2 demonstrate binding affinity towards progesterone, which needs to be corrected in the peptide to prevent cross-sensitivity and interference from other species.
Other parameters such as pH, solubility, and the size and sequence length of the peptide are all important considerations and must be designed to achieve the desired results. Finally, the designed peptide must be terminated so that it binds to the electrodes. This is achieved by thiol termination at the N-terminus of the baseline peptide. The cysteine residue in our baseline peptide contains a thiol group that can readily form a strong covalent bond with gold atoms on the electrode surface, creating a stable gold-thiol bond. A peptide with a free N-terminal cysteine residue can be immobilized on a gold electrode through self-assembled monolayer (SAM) formation or covalent bonding.
Thus, the proposed peptide “CQLIQMDYVGNGTAFFILPDQGQMDTVIALSRDTIDR”, depicted in Fig. 5, is > 50% hydrophobic and acidic. It will therefore require an acidic solvent such as acetic acid to dissolve, followed by phosphate-buffered saline (PBS) to adjust the pH to around 6.3, mimicking the mean pH of eccrine sweat.
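The sequence-level properties quoted for the peptide can be checked by simple residue counting, as in the sketch below. The residue classes are an assumption: here G, A, V, L, I, M, P, F, W and Y are counted as hydrophobic (nonpolar or aromatic), and a stricter classification would give a lower fraction. The nominal charge counts only D/E and K/R/H side chains and ignores the termini and the cysteine thiol.

```python
PEPTIDE = "CQLIQMDYVGNGTAFFILPDQGQMDTVIALSRDTIDR"

# Assumed residue classes; other hydrophobicity scales would change the counts.
HYDROPHOBIC = set("GAVLIMPFWY")
ACIDIC = set("DE")
BASIC = set("KRH")


def composition(sequence):
    """Count residues by class and derive simple sequence-level descriptors."""
    length = len(sequence)
    hydrophobic = sum(1 for r in sequence if r in HYDROPHOBIC)
    acidic = sum(1 for r in sequence if r in ACIDIC)
    basic = sum(1 for r in sequence if r in BASIC)
    return {
        "length": length,
        "n_terminal": sequence[0],
        "hydrophobic": hydrophobic,
        "hydrophobic_fraction": hydrophobic / length,
        "acidic": acidic,
        "basic": basic,
        "nominal_side_chain_charge": basic - acidic,
    }


summary = composition(PEPTIDE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the acidic residues outnumber the basic ones, consistent with the description of the peptide as acidic.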
- The RMSD plot in Fig. 3 shows no significant increase relative to the RMSD of the native protein 2V95 presented in Figure SI 4, despite the short sequence of amino acids.
- The proposed baseline peptide is approximately 1/10th the size of the native protein 2V95.
- The central part of the sequence is hydrophobic, which improves affinity and stability during electrochemical reactions.
- The cysteine (CYS) residue at the N-terminus, with its reactive -SH group, provides ease of binding to gold electrodes.
- The cost of 5 mg of the proposed peptide at > 80% purity (HPLC purification) is approx. USD 400, compared with USD 4000 for a conventional cortisol monoclonal antibody (CORT-1).
Thus, the proposed baseline peptide can be considered an efficient, cost-effective and viable alternative to antibody-based bioreceptors.