In silico design of antimicrobial oligopeptides based on 3D-QSAR modeling and bioassay evaluation

The emergence of multidrug-resistant (MDR) bacteria poses a serious health threat, and new antibacterial drugs are urgently needed. Antimicrobial peptides (AMPs) have a broad antimicrobial spectrum and a lower tendency to induce drug resistance, and are therefore promising substitutes for classical antibiotics. In this study, two classic methods, comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA), were used to analyze the structural features of AMPs against Staphylococcus aureus and Escherichia coli. The resulting three-dimensional quantitative structure-activity relationship (3D-QSAR) models showed good predictability and stability (for S. aureus, CoMFA: Q² = 0.512, R² = 0.943, F = 59.916; CoMSIA: Q² = 0.645, R² = 0.993, F = 339.242; for E. coli, CoMFA: Q² = 0.507, R² = 0.913, F = 66.862; CoMSIA: Q² = 0.573, R² = 0.966, F = 96.84). Seven novel small AMPs were designed and synthesized based on the theoretical models. The novel AMPs showed potent antibacterial activity against S. aureus and E. coli while causing low host toxicity. This study shows that 3D-QSAR models can guide the design and modification of novel AMPs, offering a potential therapeutic option against the prevalent infections caused by MDR bacteria.


Introduction
Antibiotics have been widely used for more than 70 years because of their efficiency in preventing and treating bacterial infections [1]. However, the abuse of antibiotics can lead to bacterial resistance, which undermines the effectiveness of these medications [2]. Examples are abundant and include vancomycin-resistant enterococci [3], methicillin-resistant Staphylococcus aureus [4], and ampicillin-resistant Escherichia coli [5]. The continuous emergence of resistant bacteria, and even of so-called super bacteria, has made bacterial infectious diseases an urgent problem [6,7]. Hence, the development of new, effective, and safe antibacterial drugs is necessary.
Antimicrobial peptides (AMPs) are effector molecules of the innate immune system in multicellular organisms that can resist the invasion of exogenous pathogens [8]. These substances were first discovered and studied in the natural immunity of insects around the 1980s [9-11]. AMPs can be found in various organisms such as fruit flies [12,13], bees [14,15], marine animals [16], and mammals [17]. So far, hundreds of AMPs have been found, and they can be structurally divided into four major classes: α-helical, β-sheet, loop, and extended peptides [18]. Compared with traditional antibiotics, the advantages of AMPs include good thermal stability, a high bactericidal rate, a broad antimicrobial spectrum, and the ability to inhibit the growth of fungi, parasites, and viruses. Therefore, AMPs are potentially an ideal solution to antibiotic resistance.
Traditional experimental methods for identifying antibacterial peptides among polypeptides are expensive and time-consuming [19]. An alternative is to use computational methods to screen and predict AMPs [20,21]. Compared with traditional experimental methods, computational methods are more efficient and economical [22], and they have already produced fruitful results in the investigation of AMPs [23].
Among the computational methods, QSAR models are frequently used to predict the activity and properties of an unknown set of molecules. They are essentially regression or classification models that link biological activity and physicochemical properties with molecular structure [24,25]. Three-dimensional quantitative structure-activity relationship (3D-QSAR) models extend traditional QSAR models by considering the 3D structures of the molecules [26]. In recent years, researchers have successfully applied them to evaluate the antibacterial activity of AMPs and to provide theoretical guidance for the synthesis of novel AMPs [27]. The 3D-QSAR models therefore provide an impetus for designing a new generation of synthetic AMPs.
In this study, we collected the molecular structures of 29 AMPs against S. aureus and 30 AMPs against E. coli from the literature. Two frequently used 3D-QSAR models (comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA)) [28] were then constructed and evaluated. Based on these validated models, seven peptides were designed, and tested for their cytotoxicity, plasma stability, and activity against S. aureus and E. coli. The present study can serve as the theoretical support for the future design of novel AMPs.

Results and discussion
The results of 3D-QSAR

Analysis of CoMFA models
The CoMFA model contains two fields: the steric field (S) and the electrostatic field (E). For both AMP sets, comparison of the computed data (Supplementary Tables S1 and S3) showed that the combined S + E field gave the best result. For AMPs against S. aureus: Q² = 0.51, np = 5, R² = 0.943, SEE = 0.102; for AMPs against E. coli: Q² = 0.577, np = 3, R² = 0.913, SEE = 0.099 (the related data are shown in Table 1). For the external validation peptides, r²pred = 0.746 and SDEPext = 0.471 for AMPs against S. aureus, and r²pred = 0.578 and SDEPext = 0.435 for AMPs against E. coli. The Q² values of both models are above 0.5, indicating that the constructed models could be useful in predicting the antibacterial activity of the peptides. The R² values of both models are above 0.8, indicating a strong correlation between the experimental and predicted values in the training dataset. In each model the contribution of the steric field was higher than that of the electrostatic field, showing that the steric field contributed more to the model. Figures 1A and 2A plot the activity values of the training (squares) and test (circles) peptides, along with the predicted values of the seven novel peptides (rhombuses), for the AMPs against S. aureus and E. coli, respectively. For both CoMFA models, apart from a few outliers, most of the points were scattered evenly around the fitted line.

Analysis of the contour maps
AMPs against S. aureus
Figure 3A, B shows the contour maps of the steric and electrostatic fields of the CoMFA model. In Fig. 3A, yellow contours appear at the 1-, 3-, 4-, and 5-positions of the peptide, suggesting that small amino acid residues in these regions favor higher activity. Green contours at the 6-position mark regions where increased steric bulk is predicted to enhance the peptide activity. This can explain why peptide 1 (WWRWRW-NH2), which has a tryptophan (W) at the 6-position, is more active than peptide 19 (RRWWCN-NH2), which has an asparagine (N) there: the relative molecular mass of tryptophan is larger than that of asparagine. In the CoMFA electrostatic contour map (Fig. 3B), red contours occur at the 1-, 2-, and 4-positions of the peptide and at the 5-position of the peptide chain, showing that negatively charged substituents in these regions can enhance the interaction. The blue contours at the 5-position of the peptide indicate that positively charged substituents in that position are favorable.
Fig. 3 (B, electrostatic field) Red contours indicate that negative charge is favorable; blue contours indicate the opposite.
The steric, electrostatic, and hydrophobic fields were used to construct the CoMSIA contour maps (Fig. 4). In Fig. 4A, the yellow contours at the 1-, 4-, and 6-positions of the peptide indicate that amino acid residues with a small molecular weight in these regions favor the peptide activity. As shown by the green contours, a large molecular weight at the 5-position and at the 6-position near the peptide chain can also increase activity. In Fig. 4B, red areas at the N- and C-termini of the peptide chain and at the 1-position of the peptide indicate that adding negatively charged groups here can increase the peptide activity. The blue contours at the 5- and 6-positions indicate that positively charged groups are favorable there. In Fig. 4C, gray areas appear mainly at the 1- and 4-positions of the peptide, indicating that hydrophilic groups in these areas favor the peptide activity, while the light blue region at the 6-position means that hydrophobic groups there can enhance activity. The difference in antimicrobial activity between peptides 1 and 19 can therefore also be explained by the hydrophobic tryptophan and the hydrophilic asparagine at the 6-position of these two peptides.
Fig. 4 A Steric field. Green contours mean that sterically bulky groups are favorable; yellow contours mean the opposite. B Electrostatic field. Red contours mean that negative charge is favorable; blue contours mean the opposite. C Hydrophobic field. Gray contours mean that hydrophilic groups are favorable; light blue contours mean that hydrophobic groups are favorable.
AMPs against E. coli
Figure 5 shows the contour maps of the CoMFA model (using peptide 20 as the template molecule). Figure 5A shows the steric contour map: the green areas at the 4- and 6-positions of the peptide and the yellow regions far from the peptide chain mark where increased and decreased steric bulk, respectively, are predicted to enhance the peptide activity. Figure 5B highlights the areas where increased (red regions) and decreased (blue regions) electron density can increase the activity of the peptides. Take training set peptide 18 (RRWWCD-NH2) and test set peptide 24 (RRRWWW-NH2) in the E. coli set as an example. Aspartic acid (D), a negatively charged amino acid with a relatively small molecular weight, occupies the 6-position of peptide 18, whereas the bulkier tryptophan (W), which carries no negative charge, occupies the same position in peptide 24. Consistent with the observed values, the contour maps suggest that peptide 24 is more active than peptide 18, because the contours disfavor negative charge and favor large groups at the 6-position.
Figure 6 shows the contour maps of the CoMSIA model. In Fig. 6A, red areas at the C-terminus of the peptide chain and blue areas at the 6-position of the peptide indicate that negatively and positively charged groups, respectively, are favorable there. Figure 6B shows purple areas at the 1-, 2-, and 6-positions of the peptide, indicating that hydrogen bond donors are unfavorable at these locations. The light blue areas at the 1- and 6-positions away from the peptide chain mean that hydrogen bond donors are favorable in these regions. In Fig. 6C, light purple regions can be observed at the C- and N-termini of the peptide chain, showing that hydrogen bond acceptors here help to increase the peptide activity. Again, for peptides 18 and 24, the residues at the 3- and 5-positions include the hydrogen bond acceptor cysteine (C) and the hydrogen bond donor tryptophan (W), respectively. The contour map is therefore consistent with peptide 24 having greater activity than peptide 18.
Fig. 6 A Electrostatic field. Red contours mean that negative charge is favorable; blue contours mean the opposite. B Donor field. Purple contours mean that hydrogen bond donors are unfavorable; light blue contours mean the opposite. C Acceptor field. Light purple contours mean the hydrogen bond donor is unfavorable.
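The residue-level comparisons above (peptide 1 vs. peptide 19, and peptide 18 vs. peptide 24) can be tabulated with a few lines of Python. This is only an illustrative sketch: the property table holds approximate textbook values (free amino acid molecular weight and side-chain charge near neutral pH) that are assumptions and are not taken from the paper, and the function names are hypothetical.

```python
# Approximate residue properties (assumed reference values, not from the paper):
# free amino acid molecular weight in g/mol and side-chain charge near pH 7.
RESIDUE_PROPERTIES = {
    "W": (204.23, 0),
    "N": (132.12, 0),
    "D": (133.10, -1),
    "R": (174.20, 1),
    "C": (121.16, 0),
}


def strip_amide(peptide):
    """Drop a trailing C-terminal tag such as '-NH2' and keep the sequence."""
    return peptide.split("-")[0]


def residue_at(peptide, position):
    """Return the one-letter residue at a 1-based position."""
    return strip_amide(peptide)[position - 1]


def side_chain_charge(peptide):
    """Sum of side-chain charges only (termini are ignored in this sketch)."""
    return sum(RESIDUE_PROPERTIES[r][1] for r in strip_amide(peptide))


def compare_position(peptide_a, peptide_b, position):
    """Collect residue, molecular weight, and charge at one position for two peptides."""
    result = {}
    for peptide in (peptide_a, peptide_b):
        residue = residue_at(peptide, position)
        mw, charge = RESIDUE_PROPERTIES[residue]
        result[peptide] = {"residue": residue, "mw": mw, "charge": charge}
    return result


# Position 6 in the two pairs discussed in the text.
print(compare_position("WWRWRW-NH2", "RRWWCN-NH2", 6))
print(compare_position("RRWWCD-NH2", "RRRWWW-NH2", 6))
```

Such a table only summarizes bulk and charge at a single position; it does not replace the field-based contour analysis.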

Synthesis of novel peptides
According to the established QSAR models, we designed and synthesized seven AMPs. After synthesis and purification of these peptides, the molecular weights detected by mass spectrometry (MS) were fully consistent with the theoretical calculations. High performance liquid chromatography (HPLC) showed that the purities of the synthesized peptides exceeded 95%. The predicted values, the purities of the seven AMPs, and the

Antimicrobial activity of the novel peptides
The in vitro antimicrobial properties of the seven novel peptides were evaluated against S. aureus and E. coli. As shown in Table 4, the minimal inhibitory concentration (MIC) values of the newly designed peptides range between 8.0 and 64 μg/mL, and all the designed peptides showed antimicrobial activity. Table 5 shows that the MIC values of some of the newly designed AMPs are somewhat greater than those of the natural peptides collected from the literature, while others are not. Part of the reason is that the design considered not only antibacterial activity but also, to a certain extent, reducing the hemolytic toxicity and improving the structural stability of the natural AMPs reported in the original literature. Furthermore, we analyzed the relationship between the structure and antibacterial activity of these peptides. The structures as well as the observed and predicted pIC50 values of the novel peptides are shown in Table 5. Overall, the observed and calculated values of the seven peptides differed little for both models. The exception is D2-D5, for which the observed and predicted CoMFA values differ considerably. A likely reason is that phenethylamine was introduced at the C-terminus when these peptides were designed, whereas the peptides used to construct the model were all natural peptides and none in the training set contained phenethylamine, which may explain the poor predictions. The designed peptides composed only of natural residues still fit the model very well. Moreover, although the CoMFA predictions are not very good, the CoMSIA predictions are quite reasonable. Taken together, these results indicate that the two models were well constructed and provide a theoretical basis and support for the design of new AMPs.

Plasma stability assay of the synthesized AMPs
A serious disadvantage of AMPs, which hinders their practical application, is their susceptibility to degradation. For the plasma stability test, three representative novel peptides with better effects (D1-D3) were selected. The relative content of residual AMP in plasma at different times is shown in Fig. 7. The residual percentages of AMP RRWWRW-NH2 (peptide D1) in plasma at 15, 30, 60, and 90 min were 81.6%, 65.3%, 49.7%, and 44.3%, respectively, while the residual percentages of peptides D2 and D3 at 90 min were 97.4% and 97.6%. The results indicate that peptide D1 lacks stability in plasma. In contrast, more than 97% of peptides D2 and D3 remained after 90 min of plasma incubation, indicating that these two peptides do not degrade easily in plasma and can remain stable long enough to exert their antimicrobial effect. Analyzing the structures of these three AMPs, we found that the two more stable peptides (D2 and D3) are C-terminal phenylethylamine derivatives of AMP D1. We therefore speculate that, for antibacterial peptides like D1, C-terminal phenylethylamine can be considered in later structural design and modification to improve plasma stability.

Hemolytic toxicity test of the synthesized AMPs
The cytotoxicity of AMPs toward normal mammalian cells is another important factor that limits their clinical application. Cationic AMPs work by electrostatically attracting the negatively charged phosphatidylglycerol and cardiolipin in the bacterial cell membrane, thereby destroying the membrane structure [29]. However, the membranes of eukaryotic cells also contain a small amount of negatively charged phospholipids (such as phosphatidylserine and phosphatidylinositol), which may allow AMPs to bind to eukaryotic cell membranes and thus cause cytotoxicity [30].
To determine the cytotoxicity of the novel peptides, their hemolytic activity against sheep red blood cells was measured. As shown in Fig. 8, peptides D1 and D2 were not toxic to red blood cells: their hemolytic activities at a concentration of 512 μg/mL were 2.50% and 4.33%, respectively (<10%). For peptide D3, the hemolytic activity at 64 μg/mL was 5.05%, but it rose to 22.88% at 128 μg/mL. Further modifications should be made to this peptide to decrease host toxicity.

Conclusion
In this study, SYBYL 2.1 was used to establish CoMFA and CoMSIA models to study the 3D-QSAR of 29 AMPs against S. aureus and 30 AMPs against E. coli extracted from the literature. The Q² and R² values indicate that the models are well established for both AMP sets. The contour maps of the models visually reflect the structure-activity relationships of the peptides. Based on these models, we designed and synthesized seven AMPs and predicted their activity values. We then experimentally validated that these seven peptides have antibacterial activity against a gram-positive bacterium (S. aureus) and a gram-negative bacterium (E. coli), with good purity, stability, and low toxicity. Our results indicate that well-constructed 3D-QSAR models can provide an important theoretical basis for the design, modification, and synthesis of new AMPs.
Fig. 7 Stability of antimicrobial peptides in sheep plasma
Fig. 8 Hemolytic toxicity of antimicrobial peptides in vitro

Molecular construction and optimization
The 3D structures of all peptides in the training and test sets were constructed using SYBYL 2.1. Gasteiger-Hückel charges [43] were used to calculate the peptides' charges. Energy minimization was conducted using the Tripos force field [44] with a maximum of 1000 iterations and a gradient of 0.005 kcal/(mol Å). The conformation with the lowest energy was selected as the active conformation.
The "align database" command in SYBYL 2.1 was used to superimpose the collected AMPs. The optimized peptides with the maximum activity (in their lowest-energy conformations) were selected as the templates for superimposition. The alignments of the AMPs are shown in Fig. 9.

CoMFA and CoMSIA modeling
As classic methods, CoMFA and CoMSIA models are widely used in 3D-QSAR studies. CoMFA and CoMSIA models describe the activity of compounds through two fields (electrostatic and steric) [45] and five fields (electrostatic, steric, hydrophobic, hydrogen bond acceptor, and hydrogen bond donor) [46], respectively. Partial least squares (PLS) regression [47], an extension of multiple regression, was used to derive the models. Cross-validation was performed by the leave-one-out (LOO) method [48] to calculate Q² and obtain the optimum number of components (np). The non-cross-validated correlation coefficient (R²), the F value, and the standard error of estimate (SEE) were calculated to evaluate the reliability and predictive ability of the models [49]. The external predictive ability of the models was evaluated by the predicted r² (r²pred > 0.5) and the external standard deviation error of prediction (SDEPext) using the following equations [50]:

$$ r^2_{\mathrm{pred}} = 1 - \frac{\sum_{i=1}^{n_{\mathrm{ext}}} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n_{\mathrm{ext}}} (y_i - \bar{y})^2} $$

$$ \mathrm{SDEP}_{\mathrm{ext}} = \sqrt{\frac{\sum_{i=1}^{n_{\mathrm{ext}}} (y_i - \hat{y}_i)^2}{n_{\mathrm{ext}}}} $$

where $y_i$ and $\hat{y}_i$ represent the observed and calculated values, $\bar{y}$ is the average of the observed values in the training set, and $n_{\mathrm{ext}}$ is the number of peptides in the test set.
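The validation statistics described above can be sketched in plain Python. This is a minimal illustration, not the SYBYL implementation: the function names are hypothetical, Q² is computed as 1 − PRESS/SS with SS taken around the mean of all training activities (an assumed but common convention), and the one-descriptor least-squares line is only a stand-in for the PLS model, which in practice operates on many field descriptors.

```python
import math
from statistics import mean


def q2_loo(xs, ys, fit_predict):
    """Leave-one-out Q2 = 1 - PRESS / sum((y_i - mean(y))^2).

    fit_predict(train_x, train_y, x_new) must return the prediction for
    x_new from a model fitted on the remaining samples.
    """
    press = 0.0
    for i in range(len(ys)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        press += (ys[i] - fit_predict(train_x, train_y, xs[i])) ** 2
    y_bar = mean(ys)
    return 1.0 - press / sum((y - y_bar) ** 2 for y in ys)


def r2_pred(y_obs, y_hat, y_train_mean):
    """External r2: test-set residuals relative to the training-set mean."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_hat))
    ss = sum((o - y_train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


def sdep_ext(y_obs, y_hat):
    """External standard deviation error of prediction."""
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_hat))
    return math.sqrt(press / len(y_obs))


def linear_fit_predict(train_x, train_y, x_new):
    """Hypothetical stand-in model: one-descriptor least-squares line (not PLS)."""
    x_bar, y_bar = mean(train_x), mean(train_y)
    sxx = sum((x - x_bar) ** 2 for x in train_x)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(train_x, train_y))
    return y_bar + (sxy / sxx) * (x_new - x_bar)


# Made-up descriptor and activity values, for illustration only.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.1, 2.9, 5.2, 6.8, 9.1]
print(q2_loo(descriptor, activity, linear_fit_predict))
```

Passing the model as a `fit_predict` callable keeps the cross-validation loop independent of the regression method, so a PLS routine could be substituted without changing the statistics.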

Synthesis of novel AMPs
LfcinBG 4-9 (lactoferrin B 4-9, RRWQWR), obtained by optimizing the sequence of bovine lactoferrin, contains only six natural amino acid residues and combines high antibacterial activity with low hemolysis, which makes it an ideal template for the molecular design of AMPs. This work builds on the AMP design theory obtained in the research group's previous work, namely "short peptide chain, high charge, strong amphipathicity, and more dominant amino acids." Using LfcinBG 4-9 as the template, we collected the sequences and antibacterial activities of cationic AMPs from the professional AMP database APD2. Taking into account AMP parameters including polar angle, specific secondary structure, amphiphilicity, hydrophobic moment, net charge, and hydrophobicity [51], and using bioinformatics and three-dimensional structure-activity relationship analysis, we obtained a series of rationally designed antibacterial hexapeptides. In addition, the literature reports that introducing phenethylamine can enhance the binding stability of peptides [52-54]. We therefore attempted to combine phenethylamine with the peptides when designing the structures. On this basis and according to the models, seven new peptides were designed.
The novel peptides were synthesized by the solid-phase synthesis method as described [55]. Briefly, dichloro resin was used as the carrier, with chlorine at the active site. The resin was first dissolved. The C-terminal carboxyl of the first amino acid was then reacted with the active-site chlorine on the resin. After the first amino acid had been attached to the resin, the second amino acid was connected by dehydration condensation, after which Fmoc protection was conducted. These operations were repeated according to the designed sequence until the remaining amino acids had been connected in order, and the N-terminus was then acetylated. Finally, the polypeptide was cleaved from the resin with a cleavage reagent [56,57], forming the free carboxyl.

Antimicrobial activity assay
The MIC of each peptide against the gram-positive bacterium S. aureus and the gram-negative bacterium E. coli was determined using the broth microdilution assay as described, with slight modification [58]. Briefly, mid-logarithmic phase cells were diluted to 2.0 × 10^5 CFU/mL in Mueller-Hinton broth growth medium. Fifty microliters of the diluted cell suspension was mixed in 96-well plates with 50 μL of peptide in PBS solution at different stock concentrations (2-512 μg/mL). The suspensions were then incubated at 37°C for 12 h. Bacterial growth was determined by measuring the absorbance at 600 nm using a microplate reader. The MIC was defined as the lowest concentration of the investigated peptide that completely inhibited bacterial growth.
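The MIC read-out described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the two-fold spacing of the series, the blank-corrected absorbance cutoff (`tolerance`), the function names, and the example readings are all assumptions introduced here.

```python
def two_fold_series(low, high):
    """Two-fold concentration series from low to high (inclusive), in ug/mL."""
    series = []
    conc = low
    while conc <= high:
        series.append(conc)
        conc *= 2
    return series


def mic_from_od600(od_by_conc, blank_od, tolerance=0.05):
    """Lowest concentration with no growth at it and at every higher concentration.

    A well counts as 'no growth' when its blank-corrected OD600 does not
    exceed the tolerance (an assumed cutoff). Returns None when even the
    highest concentration shows growth.
    """
    mic = None
    for conc in sorted(od_by_conc, reverse=True):
        if od_by_conc[conc] - blank_od > tolerance:
            break  # growth detected; lower concentrations cannot be the MIC
        mic = conc
    return mic


# Hypothetical OD600 readings for one peptide (not data from the paper).
readings = dict(zip(
    two_fold_series(2, 512),
    [0.62, 0.60, 0.55, 0.31, 0.06, 0.05, 0.05, 0.05, 0.05],
))
print(mic_from_od600(readings, blank_od=0.05))
```

Scanning from the highest concentration downward avoids reporting a spuriously clear well below a concentration that still shows growth.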

Hemolysis assay
The hemolytic activity of each peptide was determined as described, with slight modification [59]. Briefly, fresh sheep red blood cells (RBCs) were washed three times with normal saline and then resuspended to give a 3% red cell suspension. One hundred microliters of the sheep RBC suspension was incubated with 100 μL of peptide solution at concentrations ranging from 2 to 512 μg/mL. Sheep RBCs suspended in normal saline alone were used as the negative control, while cells lysed with 0.1% Triton X-100 were used as the positive control. After incubation for 0.5 h at 37°C, the suspension was centrifuged at 3000 rpm for 10 min. One hundred microliters of supernatant was added to 96-well plates, and the absorbance was recorded at 570 nm. The experiment was repeated three times, and the hemolysis ratio was taken as the average of the three repeats. The hemolysis ratio was calculated as [60]:

$$ \text{Hemolysis ratio} = \frac{\mathrm{OD}_{\mathrm{test}} - \mathrm{OD}_{\mathrm{negative}}}{\mathrm{OD}_{\mathrm{positive}} - \mathrm{OD}_{\mathrm{negative}}} \times 100\% $$

where $\mathrm{OD}_{\mathrm{test}}$, $\mathrm{OD}_{\mathrm{negative}}$, and $\mathrm{OD}_{\mathrm{positive}}$ are the absorbances of the test, negative control, and positive control wells, respectively.
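The hemolysis ratio defined above, averaged over three repeats, can be computed as in the following sketch. The function names and the absorbance values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def hemolysis_percent(od_test, od_negative, od_positive):
    """Hemolysis ratio (%) of one test well relative to the two controls."""
    return (od_test - od_negative) / (od_positive - od_negative) * 100.0


def mean_hemolysis_percent(od_tests, od_negative, od_positive):
    """Average hemolysis ratio (%) over repeated measurements."""
    return mean([hemolysis_percent(od, od_negative, od_positive) for od in od_tests])


# Hypothetical absorbance readings at 570 nm (not data from the paper).
repeats = [0.10, 0.11, 0.09]
print(mean_hemolysis_percent(repeats, od_negative=0.05, od_positive=1.05))
```

By construction the negative control maps to 0% and the positive control to 100%, so values below 10% correspond to the low-toxicity range referred to in the results.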

Plasma stability assay
Sheep plasma (25%) was incubated at 37°C for 30 min. Two hundred and fifty microliters of the sheep plasma was then mixed with 50 μL of peptide solution at a concentration of 1 mg/mL. The mixture was incubated in a biochemical incubator at 37°C with shaking at 100 rpm. After incubation for 0, 10, 30, 60, and 90 min, 200 μL of TFA was added to stop the reaction of the peptide in plasma. The mixture was cooled for 30 min at 4°C and then centrifuged at 1200 rpm for 30 min. Two hundred microliters of supernatant was extracted and analyzed using HPLC-QqQ-MS [61]. An electrospray ionization source at 135 V was used for scanning.
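The text does not state how the residual peptide content is derived from the HPLC-QqQ-MS signal. One plausible reading, assumed here purely for illustration, is that the signal (for example a peak area) at each time point is expressed as a percentage of the signal at 0 min. The function name and the peak areas below are hypothetical.

```python
def residual_percent(signal_by_minute):
    """Residual peptide (%) at each time point relative to the 0 min sample.

    Assumes the dictionary contains a 0 min entry and that the signal is
    proportional to the amount of intact peptide.
    """
    baseline = signal_by_minute[0]
    return {
        minute: signal / baseline * 100.0
        for minute, signal in sorted(signal_by_minute.items())
    }


# Hypothetical peak areas (arbitrary units, not data from the paper).
areas = {0: 1000.0, 30: 653.0, 60: 497.0, 90: 443.0}
for minute, percent in residual_percent(areas).items():
    print(minute, round(percent, 1))
```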