Quantum-Chemical Study of the Biogenic Amino Acids

Ten amino acids have been studied by quantum-chemical calculations using the ab initio MO-LCAO-SCF and the semiempirical PM3 methods. When the geometry optimization started from the X-ray structure confirming the zwitterionic form, the ab initio calculations in vacuo resulted in the amino acid (canonical) form with the hydrogen atom attached not to the amine but to the carboxylate group. At the optimum geometry a number of properties were evaluated: dipole moment, dipole polarizability, molecular surface, molecular volume, HOMO and LUMO energies, and the ionization energy and electron affinity obtained by the ΔSCF approach, with their values corrected for electron correlation by the 2nd-order perturbation theory (MP2). In addition, the molecular electrostatic potential and the charge density have been drawn. These properties have been mutually correlated by employing statistical multivariate methods: the cluster analysis, the probabilistic neural network classifier, the principal component analysis and the Pearson pair correlation.


Introduction
Amino acids occur as building blocks of proteins and other polypeptides. Of the amino acids that make up proteins (biogenic amino acids), 10 were selected for the present theoretical study, which follows up previous voltammetric experiments. According to their ability to be synthesized in organisms, they can be classified as non-essential (alanine Ala, asparagine Asn, glutamic acid Glu), conditionally essential (arginine Arg, cysteine Cys, glycine Gly, tyrosine Tyr) and essential (histidine His, phenylalanine Phe, tryptophan Trp) [1]. They can also be classified by their side chain: electrically charged (Arg, His, Glu), polar uncharged (Asn), hydrophobic (Ala, Phe, Tyr, Trp), or special cases (Gly, Cys) [2,3].
Theoretical investigations of amino acids by contemporary quantum-chemical methods confirmed that including several water molecules in the model shifts the stabilization from the canonical (amino acid) form, preferred in vacuo, to the zwitterionic form observed in solution [4,5].
Amino acids can be either oxidized or reduced, as confirmed by cyclic voltammetry. The oxidation and/or reduction potentials depend upon the pH of the sampled solution [6,7]. These species manifest their redox properties in spontaneous reactions with iron(II) salts [8,9]: under anaerobic conditions they are capable of liberating colloidal Fe(0) from Fe(II) salts. Thus their reducing ability is associated with their oxidation potential.

Methods
The quantum-chemical calculations have been done using the MO-LCAO-SCF approach in two versions [10]: (i) the semiempirical PM3 method, which is based upon parametrization and a zero-differential-overlap approximation and is fast but less reliable; (ii) ab initio calculations conducted with the STO 6-31** basis set. In both cases a full geometry optimization has been performed, starting from the geometry retrieved from the Cambridge Crystallographic Data Centre [11]. At the optimum geometry a set of scalar molecular properties has been evaluated: the dipole moment, dipole polarizability, molecular surface, molecular volume, and the energies of the HOMO and LUMO pair. The total energies of the molecular cation, E⁺, and anion, E⁻ (UHF calculations), have been used in evaluating the ionization energy E_i(ΔSCF) and the electron affinity E_g(ΔSCF), respectively, from their differences with respect to the total energy of the neutral molecule, E⁰.
At the optimum geometry the correlation energy has also been evaluated via the Møller-Plesset 2nd-order perturbative method (MP2): E⁰(MP2), E⁺(MP2) and E⁻(MP2). This allows the evaluation of the corrected ionization energy E_i(MP2) and electron affinity E_g(MP2). At the optimum geometry two molecular graphs have been generated: the molecular electrostatic potential and the electron density; these are plotted at a selected constant contour.
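The ΔSCF and MP2 quantities described above can be written compactly as follows. This is only a sketch under an assumed convention: both quantities are taken as total-energy differences relative to the neutral molecule, the sign of E_g is chosen so that a positive value means the anion lies above the neutral molecule in energy, and E(MP2) is assumed to denote the total energy including the MP2 correlation correction.

```latex
\begin{align}
E_i(\Delta\mathrm{SCF}) &= E^{+}(\mathrm{SCF}) - E^{0}(\mathrm{SCF}) \\
E_g(\Delta\mathrm{SCF}) &= E^{-}(\mathrm{SCF}) - E^{0}(\mathrm{SCF}) \\
E_i(\mathrm{MP2}) &= E^{+}(\mathrm{MP2}) - E^{0}(\mathrm{MP2}) \\
E_g(\mathrm{MP2}) &= E^{-}(\mathrm{MP2}) - E^{0}(\mathrm{MP2})
\end{align}
```

Under this convention the Koopmans estimate, E_i ≈ -E(HOMO), is the frozen-orbital counterpart of the first relation.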

Geometry of amino acids
Amino acids, in general, can exist or coexist in two basic forms: (i) the amino acid form with the proton attached to the oxygen atom of the carboxyl group; (ii) the zwitterionic form with the proton attached to the amine group, forming a positively charged ammonium moiety and a negatively charged carboxylate site.
X-ray structure determination in the solid state confirms the zwitterionic form. In solution, however, the balance between these two forms depends upon the pH; at neutral pH the zwitterionic form is present. The modelling "in silico" shows that these two forms are close in energy, and the fixed form can depend upon the method of calculation and also upon the starting structure for the geometry optimization. In general, the PM3 method reproduces the zwitterionic structure of the system. The ab initio method often turns the Cα-NH3⁺ group and the carboxylate group -COO⁻ in such a way that a five-membered ring {N-Cα-C-O-H} is formed in which the hydrogen atom is attached to the carboxylate oxygen (Figure 1). Perhaps the polarization functions embodied in the basis set are responsible for such an effect.
The X-ray determined molecular structures, optimized geometries, molecular electrostatic potentials, and molecular electron density functions are displayed in Appendix 1. The calculated molecular properties, such as ionization energies and electron affinities (at different levels of approximation), dipole moment, dipole polarizability volume, molecular surface, and molecular volume, are listed in Table 1. Some experimental data, such as the dissociation constants K_a1 (carboxylate) and K_a2 (amine) and the octanol/water partition coefficient P, are presented in Table 2.
a Acidity constants K_a1 (carboxylic), K_a2 (amine), K_a3 (special group), experimental data vary slightly depending upon the source; octanol/water partition coefficient P [12].

Characteristic properties of individual species
Glycine. Glycine crystallizes in the zwitterionic form. However, the geometry optimization by the ab initio method yields the amino acid form as the most stable configuration in vacuo (Figure 2). The ionization energy calculated via the ΔSCF approach (208 kcal mol⁻¹) differs substantially from the estimate based on Koopmans' theorem, according to which E_i ≈ -E(HOMO) = 250 kcal mol⁻¹. The inclusion of the correlation energy through the 2nd-order perturbation theory gave the corrected value of E_i(MP2) = 238 kcal mol⁻¹. The electron affinities display smaller discrepancies; however, those data are in principle less accurate. The contour diagram drawn on the molecular electrostatic potential, in the Appendix, shows the acidic (oxygens, red, negative) and basic (hydrogens, blue, positive) sites. The results obtained by the semiempirical PM3 method closely follow the ab initio data, except for the LUMO energy and, consequently, the electron affinities E_g(ΔSCF) and corrected E_g(MP2). The dipole moment μ = 4.6 D and the dipole polarizability volume α = 30 Å³ adopt the values expected for such a slightly polar molecule.
L-Alanine. In line with expectations, the molecular properties are very similar to those of glycine. This molecule is a bit more polar, μ = 5.5 D, and more polarizable, α = 40 Å³.
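The three glycine ionization energies quoted above can be put side by side with a few lines of Python. This is a minimal sketch, not part of the original calculations: the dictionary keys and function names are illustrative, and the only external fact used is the conversion factor 1 eV ≈ 23.0605 kcal mol⁻¹.

```python
# Conversion factor (assumed constant): 1 eV per molecule = 23.0605 kcal/mol.
KCAL_PER_MOL_PER_EV = 23.0605

# Glycine ionization energies quoted in the text, in kcal/mol.
glycine_ei = {
    "Koopmans": 250.0,  # -E(HOMO)
    "dSCF": 208.0,      # E_i(dSCF)
    "MP2": 238.0,       # E_i(MP2)
}


def to_ev(kcal_per_mol):
    """Convert an energy from kcal/mol to eV."""
    return kcal_per_mol / KCAL_PER_MOL_PER_EV


def deviations(values, reference):
    """Return the difference of every entry from the reference entry."""
    ref = values[reference]
    return {name: v - ref for name, v in values.items() if name != reference}


# Print each estimate in both units, then the shifts relative to dSCF.
for name, value in glycine_ei.items():
    print(f"{name:9s} {value:6.1f} kcal/mol = {to_ev(value):5.2f} eV")
print(deviations(glycine_ei, "dSCF"))
```

The same two helpers can be applied to any other species once its three values are read from Table 1.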
L-Asparagine. The geometry optimization resulted in a final form showing a hydrogen bond N-H…O that is part of the five-membered ring {N-Cα-C-O-H}. This form can be classified as a "wrapped" or "packed" one. The molecule is even more polar, μ = 6.7 D, and even more polarizable, α = 55 Å³.
Cysteine. This is a very different molecule whose structure corresponds to the "open" or "unpacked" amino acid form. The ab initio data show a negative value of the LUMO energy, which would indicate a spontaneous reduction. However, the calculated positive electron affinities evaluated via eq. (2), E_g(ΔSCF) = 83 kcal mol⁻¹ and E_g(MP2) = 79 kcal mol⁻¹, do not confirm such a predisposition. The molecule has a rather low polarity, μ = 3.8 D, and a medium polarizability, α = 54 Å³.
Arginine. While the X-ray structure analysis confirms an unpacked form of this molecule, the geometry optimization resulted in the wrapped zwitterionic form with two hydrogen bonds from the carboxylate oxygen atom to the hydrogen atoms attached to the guanidinium group. A large polarity, μ = 7.3 D, and an enhanced polarizability, α = 85 Å³, are predicted.
L-Phenylalanine. This is the first member of the series containing an aromatic ring. The optimized geometry resembles the CCDC pattern, however, with the final packed amino acid form. The predicted polarity is μ = 4.9 D, with a rather large polarizability, α = 93 Å³.
L-Tyrosine. Attachment of the OH group in this molecule does not alter the properties significantly: the geometry converged to the packed amino acid form with polarity μ = 6.5 D and polarizability α = 96 Å³.
L-Histidine. This molecule is the only one that retains its zwitterionic form also in vacuo. This causes a much increased dipole moment, μ = 13.0 D, but a medium polarizability, α = 76 Å³.
L-Tryptophan. The amino acid form is more stable in vacuo than the zwitterionic form. Though the dipole moment is the lowest over the studied series, μ = 3.0 D, the polarizability is the highest, α = 117 Å³. This molecule displays the lowest ionization energy, E_i(MP2) = 132 kcal mol⁻¹.
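The dipole moments and polarizability volumes quoted for the individual species can be collected and checked with a short script. This is a sketch using only the nine value pairs given in the text (no values are quoted for glutamic acid); the variable and function names are illustrative, and the Pearson coefficient is computed here for this single pair of descriptors only, not for the full worksheet.

```python
from statistics import mean

# (dipole moment in D, polarizability volume in Å^3) as quoted in the text.
# Glutamic acid is omitted because no values are quoted for it.
properties = {
    "Gly": (4.6, 30.0),
    "Ala": (5.5, 40.0),
    "Asn": (6.7, 55.0),
    "Cys": (3.8, 54.0),
    "Arg": (7.3, 85.0),
    "Phe": (4.9, 93.0),
    "Tyr": (6.5, 96.0),
    "His": (13.0, 76.0),
    "Trp": (3.0, 117.0),
}


def pearson_r(xs, ys):
    """Pearson pair correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


dipoles = [mu for mu, _ in properties.values()]
polarizabilities = [alpha for _, alpha in properties.values()]

# Extremes of the series, identified by amino acid code.
most_polar = max(properties, key=lambda k: properties[k][0])
least_polar = min(properties, key=lambda k: properties[k][0])
most_polarizable = max(properties, key=lambda k: properties[k][1])
r = pearson_r(dipoles, polarizabilities)

print(most_polar, least_polar, most_polarizable, round(r, 3))
```

For these nine pairs the coefficient comes out close to zero, so the dipole moment carries information largely independent of the polarizability.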

Application of multivariate methods
The worksheet formed of the data from Tables 1 and 2 was processed by the multivariate methods. The PNN classifier rearranges the a priori classified input group of the objects (aliphatic and aromatic) into the output group, showing the "incorrectly classified" cases (Table 3). Only object 9 (histidine) is reassigned: it has a shorter "distance", d = 0.46, to the aliphatic group and a longer one, d = 0.54, to the aromatic group.

Table 3
No.  Amino acid       Input group a  Output group a
1    Gly         1    al             al
2    Ala         1    al             al
3    Asn         2    al             al
4    Cys         2    al             al
5    Glu         1    al             al
6    Arg         2    al             al
7    Phe         3    ar             ar
8    Tyr         3    ar             ar
9    His         2    ar             al
10   Trp         3    ar             ar
a al - aliphatic, ar - aromatic.

Finally, Table 4 lists the correlation coefficients, r, for pairs of molecular properties. These data confirm the results of the PCA in numerical form. The members of the group D show r ~ 1, whereas the observable Dip of the group C is rather unrelated to the remaining ones.
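The idea behind a probabilistic neural network classifier can be illustrated with a small Parzen-window sketch. This is not the authors' implementation and does not reproduce the "distances" of Table 3: it assumes only two descriptors per molecule (the dipole moment and polarizability volume quoted in the text, Glu omitted for lack of quoted values), z-score scaling, Gaussian kernels and a smoothing width sigma = 1, all of which are illustrative choices.

```python
import math
from statistics import mean, pstdev

# Assumed descriptors: (dipole moment in D, polarizability volume in Å^3),
# grouped by the a priori class used as the PNN input group.
training = {
    "al": {"Gly": (4.6, 30.0), "Ala": (5.5, 40.0), "Asn": (6.7, 55.0),
           "Cys": (3.8, 54.0), "Arg": (7.3, 85.0)},
    "ar": {"Phe": (4.9, 93.0), "Tyr": (6.5, 96.0), "Trp": (3.0, 117.0)},
}
query_name, query = "His", (13.0, 76.0)  # object to be classified


def standardize(rows):
    """Return a function mapping a raw vector to z-scores based on `rows`."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stdevs = [pstdev(c) for c in columns]
    return lambda v: tuple((x - m) / s for x, m, s in zip(v, means, stdevs))


def pnn_posteriors(classes, x, sigma=1.0):
    """Average Gaussian kernel per class, normalized so the scores sum to 1."""
    scores = {}
    for label, points in classes.items():
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(p, x)) / (2 * sigma ** 2))
            for p in points
        ]
        scores[label] = sum(kernels) / len(kernels)
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


# Scale all descriptors together, then score the query against each class.
all_rows = [v for group in training.values() for v in group.values()] + [query]
z = standardize(all_rows)
classes = {label: [z(v) for v in group.values()] for label, group in training.items()}
posteriors = pnn_posteriors(classes, z(query))
print(query_name, posteriors)
```

In this two-descriptor sketch histidine also scores higher for the aliphatic class than for the aromatic one, although the numbers depend on the assumed sigma and scaling.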

Conclusions
Starting from the X-ray structure of 10 biogenic amino acids in the zwitterionic form, the geometry optimization using ab initio calculations in vacuo results in the amino acid (canonical) form with the hydrogen atom attached not to the amine but to the carboxylate group. There are three exceptions that retain their zwitterionic form: histidine, arginine and asparagine. The statistical multivariate methods confirm a classification of the objects into three groups. Unlike the three aromatic amino acids (Phe, Tyr, Trp), histidine resembles the group of aliphatic amino acids; it is closely related to Asn and Arg. The classification of the molecular properties is more variable and covers five groups according to their similarity: the group A = {LUMO, -HOMO, Ei, Eic} shows a close relationship of the variables describing the ionization process; B = {Eg, Egc} describes the electron affinity. The group D = {Pol, Sur, Vol} is associated with the molecular topology. The members of the groups A+B anticorrelate with those inside the group D: with increasing surface, volume and polarizability the ionization energy and electron affinity decrease.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflicts of interest.
Ethical approval Not applicable.
Consent to participate Not applicable.
Consent for publication Not applicable.

Figure 1
Optimized molecular geometry in vacuo.

Figure 2
Geometry and molecular electrostatic potential of the glycine molecule. Analogous information about the remaining amino acids is deposited in Appendix A.
