Quantum Surprises from the Watson-Crick and Hoogsteen G·C Nucleobase Pairs: A Comprehensive QM/QTAIM Investigation

In this study, performed at the MP2/6-311++G(2df,pd)//B3LYP/6-311++G(d,p) level of theory in the isolated state, we revealed 14 novel physico-chemical mechanisms of the tautomerization of the G·C nucleotide base pairs in the Watson-Crick G·C(WC) / G*·C*(WC), reverse Watson-Crick G*·C*(rWC) / G·C*O2(rWC), Hoogsteen G*·C*(H) / G*N7·C(H) or reverse Hoogsteen G*·C*(rH) / G*N7·C(rH) configurations into the wobble (wWC, wH) and reverse wobble (rwWC, rwH) base pairs: 1. G·C(WC)↔G·C*(rwWC), 2./3. G*·C*(WC)↔G·C*(rwWC)/G*N2·C*(rwWC), 4. G*·C*(rWC)↔G*·C(wWC), 5. G·C*O2(rWC)↔G·C*(wWC); 6./7./8./9. G*·C*(H)↔G*·C(rwH)/G*·C*O2(wH)/G*·C*O2(rwH)/G*N7·C*(rwH)↔G*·C*O2(rwH), 10. G*N7·C(H)↔G*·C(wH) amino, 11./12. G*·C*(rH)↔G*N7·C*(wH)/G*·C(wH), 13. G*N7·C(rH)↔G*N7·C*(wH)↔G*·C(wH) and 14. G*N7·C*(rwH)↔G*N7·C*(rwH) perp↔G ·C(wH)↔G*·C(rwH) reaction pathways. We established that the presence of two anti-parallel neighboring H-bonds in the base pair is a necessary and sufficient condition for such transformations, since it enables intermolecular proton transfer between the bases within the pair. These tautomeric transitions are controlled by TSs with quasi-orthogonal structure, which are tight G⁺·C⁻/G⁻·C⁺ ion pairs joined by at least two parallel intermolecular H-bonds that converge on a common negatively charged endocyclic N/C atom acting as the proton acceptor. All reaction pathways have been reliably confirmed. These transitions are accompanied by a change of the mutual orientation of the N9H and N1H glycosidic bonds of the bases from cis to trans and vice versa. These data complement the previously reported mechanisms of the tautomerization of the classical A·T and G·C DNA base pairs. Experimental verification of the novel G·C nucleobase pairs is an attractive task for future research.


INTRODUCTION.
The topic of tautomerism is of paramount importance nowadays [1][2][3][4][5][6][7], since, on the one hand, it helps to explain the chemical structure of biomolecules and, on the other hand, their functioning in the living cell.
In general, this topic has attracted the active attention of researchers in different areas: drug design, physics of crystals, NMR spectroscopy and biologically important molecules [8][9][10][11][12][13]. The view that tautomeric transformations in biological molecules are inseparable from conformational transformations has recently become increasingly popular [3,14].
This opens new possibilities for understanding the subtle mechanisms by which biomolecules function in a living cell.
Tautomerism is of special interest in nucleic acids [15], since the transfer of a single proton inside a nucleobase pair leads to fundamental changes in its structure and functioning [4]. Tautomerism is also usually associated with the mutagenic properties of molecules [3,16]. Thus, it was recently found [3,18] that sequential single proton transfer inside the classical Watson-Crick A·T/G·C and unusual DNA base pairs leads to Watson-Crick↔wobble transitions and the subsequent formation of wobble base pairs involving rare tautomers, which cause spontaneous point mutations.
In view of all the data presented in the literature [2,7,18,19] on the possible mechanisms of tautomerization, a quite logical question arises: "How complete are they?" Answering this biologically important question makes it possible to establish which of these mechanisms are responsible for spontaneous point mutations and which serve other biological roles.
This study is a further development of previous works [20][21][22] devoted to the tautomeric-conformational transformations of the classical G·C DNA base pairs, which lead to the formation of the mutagenic tautomers of the G and C DNA bases.
Thus, the aim of this study is to reveal the physico-chemical mechanisms that define the tautomerization of the G·C nucleobase pair in its four biologically important configurations. These investigations have been performed at the MP2/6-311++G(2df,pd)//B3LYP/6-311++G(d,p) level of QM theory. The data obtained significantly extend existing ideas about the possible biological role of the prototropic tautomerism of nucleotide base pairs in the functioning of nucleic acids.

COMPUTATIONAL METHODS.
Density functional theory calculations of the geometry and vibrational frequencies.
Equilibrium geometries of the investigated G·C nucleobase pairs and of the transition states (TSs) of their mutual conformational-tautomeric transformations, as well as their harmonic vibrational frequencies, have been calculated at the B3LYP/6-311++G(d,p) level of theory [23][24][25][26][27] using the Gaussian'09 program package [28]. This level of theory has proved successful for calculations of similar systems [29,30]. A scaling factor of 0.9668 has been applied in the present work to correct the harmonic frequencies of all complexes and of the TSs of their tautomeric and conformational transitions [31,32]. The local minima and the TSs, the latter located by the Synchronous Transit-guided Quasi-Newton method [33], have been confirmed on the potential energy surface by the absence or presence, respectively, of an imaginary frequency in their vibrational spectra. Further, the reaction pathways of the conformational-tautomeric transformations have been confirmed by the Intrinsic Reaction Coordinate (IRC) procedure [20,21], moving from each TS in the reverse and forward directions.
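The frequency scaling and the imaginary-frequency check described above can be illustrated with a minimal Python sketch. The function names are hypothetical, the example wavenumbers are made up, and the convention that imaginary modes appear as negative numbers is an assumption about how the frequencies were exported; only the factor 0.9668 comes from the text.

```python
# Illustrative helpers only: names, sample data and the sign convention
# for imaginary modes are assumptions, not taken from the paper.
SCALE_FACTOR = 0.9668  # scaling factor quoted in the text


def scale_frequencies(freqs_cm1, factor=SCALE_FACTOR):
    """Return harmonic wavenumbers (cm^-1) multiplied by the scaling factor."""
    return [f * factor for f in freqs_cm1]


def count_imaginary(freqs_cm1):
    """Count imaginary modes, assumed here to be listed as negative numbers."""
    return sum(1 for f in freqs_cm1 if f < 0.0)


def classify_stationary_point(freqs_cm1):
    """Label a stationary point by its number of imaginary frequencies."""
    n_imag = count_imaginary(freqs_cm1)
    if n_imag == 0:
        return "minimum"
    if n_imag == 1:
        return "transition state"
    return "higher-order saddle point"


if __name__ == "__main__":
    made_up_spectrum = [-512.3, 48.7, 1650.0]  # hypothetical values, cm^-1
    print(classify_stationary_point(made_up_spectrum))
    print(scale_frequencies([1650.0]))
```

A structure with exactly one imaginary mode is labelled a transition state here; connecting it to the two minima on either side still requires the IRC calculation mentioned in the text.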
All calculations have been carried out in the continuum with ε=1 under normal conditions (T=298.15 K) [34,35], which adequately reflects the processes occurring in real biological systems without loss of the intrinsic structural and functional properties of the base pairs within DNA. Moreover, this environment (ε=1) satisfactorily models the base-pair recognition pocket of the DNA-polymerase machinery, which is substantially hydrophobic.
Single point energy calculations.
Geometry optimizations have been followed by single point electronic energy calculations at the MP2/6-311++G(2df,pd) level of theory [36,37] for the optimized geometries of the G·C nucleobase pairs and of the TSs of their conformational-tautomeric transformations.
QTAIM analysis.
Bader's Quantum Theory of Atoms in Molecules (QTAIM) [38][39][40][41][42][43] has been applied, using the AIMAll program package [44], to analyze the electron density distribution. The presence of a bond critical point (BCP), namely the so-called (3,-1) BCP, and of a bond path between the hydrogen donor and acceptor, as well as a positive value of the Laplacian at this BCP (Δρ>0), have been considered as criteria for H-bond formation [38][39][40][41][42][43]. Wave functions have been obtained at the B3LYP/6-311++G(d,p) level of theory.
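The three H-bond criteria listed above can be expressed as a simple predicate. The following Python sketch is illustrative only: the `CriticalPoint` record and its field names are hypothetical and do not reproduce the AIMAll output format, and the numerical values in the example are made up.

```python
from dataclasses import dataclass


@dataclass
class CriticalPoint:
    """Hypothetical record for one critical point of the electron density."""
    rank: int            # number of non-zero Hessian eigenvalues
    signature: int       # sum of the signs of those eigenvalues
    laplacian: float     # Laplacian of the electron density at the point, a.u.
    has_bond_path: bool  # True if a bond path links the H atom and the acceptor


def is_hydrogen_bond(cp: CriticalPoint) -> bool:
    """Apply the three criteria from the text: a (3,-1) BCP, a bond path
    between hydrogen and acceptor, and a positive Laplacian at the BCP."""
    is_bcp = (cp.rank, cp.signature) == (3, -1)
    return is_bcp and cp.has_bond_path and cp.laplacian > 0.0


if __name__ == "__main__":
    # Made-up values for illustration only.
    contact = CriticalPoint(rank=3, signature=-1, laplacian=0.09, has_bond_path=True)
    print(is_hydrogen_bond(contact))
```

Keeping the criteria in one function makes it explicit that all three must hold at once; a contact with a (3,-1) BCP but a negative Laplacian, for example, is rejected.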

OBTAINED RESULTS AND THEIR DISCUSSION.
The results obtained are presented in Table 1 and in Figure 1 and are analyzed in more detail below.
First, we consider the novel tautomerization pathways of the G·C nucleobase pairs that have Watson-Crick geometry with cis-oriented N9H and N1H glycosidic bonds and reverse Watson-Crick geometry with trans-oriented N9H and N1H glycosidic bonds [...] (Table 1). [...] quasi-orthogonal tight G⁻·C⁺ ion pair, which is deprotonated at the N1 and N2 sites, protonated at the O6 site of the G base and joined by four specific intermolecular contacts: two H-bonds, (C)N4⁺H…N1⁻(G) and (C)N3⁺H…N1⁻(G), which are locked on the joint N1⁻ atom, and two attractive van der Waals contacts, (C)N4⁺…N2⁻(G) and (C)N3⁺…N2⁻(G), which are locked on the joint N2⁻ atom (Fig. 1, 4).
The second of the aforementioned base pairs, G·C*O2(rWC), tautomerically transforms into the wobble Watson-Crick G·C*(wWC) nucleobase pair, in which the pyrimidine C base is displaced into the major groove of DNA relative to the purine G base. This G·C*O2(rWC)↔G·C*(wWC) transition is controlled by the TS G⁻·C⁺ G·C*O2(rWC)↔G·C*(wWC), a tight G⁻·C⁺ ion pair with quasi-orthogonal geometry, which is deprotonated at the N1 site of the G base and joined by two intermolecular H-bonds, (C)O2⁺H…N1⁻(G) and (C)N3⁺H…N1⁻(G) (Fig. 1, 5).
Next, we consider results that are attractive from the biological point of view and concern novel pathways of the tautomerization of the Hoogsteen and reverse Hoogsteen G·C nucleobase pairs with trans-oriented N9H and N1H glycosidic bonds.
4. It was shown that the Hoogsteen G*t·C*(H) nucleobase pair with cis-oriented N9H and N1H glycosidic bonds and a trans-oriented O6H hydroxyl group of the G* base tautomerizes into the non-planar reverse wobble Hoogsteen G*t·C(rwH) base pair with trans-oriented N9H and N1H glycosidic bonds: 6. G*t·C*(H)↔G*t·C(rwH) (Fig. 1, 6). This tautomeric transition is controlled by the TS G⁺·C⁻ G*t·C*(H)↔G*t·C(rwH), which is a tight G⁺·C⁻ ion pair with quasi-orthogonal geometry, protonated at the N7 site of the G base and stabilized by three intermolecular H-bonds: (G)O6⁺H…N3⁻(C) and (G)N7⁺H…N3⁻(C), which converge on the common N3⁻ atom, and (G)N7⁺H…N4⁻(C). The last two of them are bifurcated from the common N7H group of the G base.
The first stage of pathway 13, G*tN7·C(rH)↔G*tN7·C*(wH), is controlled by the quasi-orthogonal TS G⁻·C⁺ G*tN7·C(rH)↔G*tN7·C*(wH), which is a tight G⁻·C⁺ ion pair formed with the participation of the carbanion of the [...] bond, leading to the wobble Hoogsteen G*t·C(wH) nucleobase pair (Fig. 1, 13). The resulting G*t·C(wH) nucleobase pair is stabilized by two intermolecular [...] (Fig. 1, 14). This reaction is not accompanied by a change in the mutual orientation of the N9H and N1H glycosidic bonds.
This transition occurs through three TSs (TS G*N7·C*(rwH)↔G*N7·C*(rwH)perp, TS G*N7·C*(rwH)perp↔G⁻·C⁺(rwH) and TS G⁻·C⁺ G⁻·C⁺(rwH)↔G*t·C(rwH)) and two dynamically stable intermediates, the reverse wobble Hoogsteen G*N7·C*(rwH)perp nucleobase pair and the reverse wobble Hoogsteen G⁻·C⁺(rwH) ion pair, which are significantly non-planar structures (Fig. 1, 14). The first TS, TS G*N7·C*(rwH)↔G*N7·C*(rwH)perp, is responsible for the conformational G*N7·C*(rwH)↔G*N7·C*(rwH)perp transition; the second, TS G*N7·C*(rwH)perp↔G⁻·C⁺(rwH), is responsible for the G*N7·C*(rwH)perp↔G⁻·C⁺(rwH) single proton transfer; and the third, TS G⁻·C⁺ G⁻·C⁺(rwH)↔G*t·C(rwH), leads to the formation of the reverse wobble Hoogsteen G*t·C(rwH) nucleobase pair (Fig. 1, 14).
We now briefly consider the non-planar deformation of the G and C nucleotide bases that accompanies the processes investigated here. Despite the structural softness of the G and C bases with respect to bending [20,21], the rings of the G and C bases remain planar even at the TSs of the transitions. The largest orientational deformation occurs in the exocyclic fragments, namely the NH2 amino groups and the OH hydroxyl groups. In particular, the pyramidality of the NH2 amino group of the G base increases significantly upon transformation into the G⁻·C⁺ ion pair, which could be explained by the weakening of the electronic conjugation between the N2 amino atom and the π-electron system of the ring of the base upon its deprotonation. Pyramidalization of the NH2 amino group of the protonated C base is also observed. The largest changes in the orientation of the exocyclic groups were observed at the TS G⁻·C⁺ G*t·C*(rH)↔G*N7·C*(wH) (Fig. 1, 11).
The appeal of these tautomeric transitions, which are accompanied by a change in the mutual orientation of the glycosidic bonds of the bases from cis to trans, lies in the fact that they have a rather high interaction energy at the TSs (ΔGint > 100 kcal·mol⁻¹) and can therefore reorganize the stacking of the neighboring base pairs and significantly change the conformation of the sugar-phosphate residues. In other words, they are strong candidates for the role of drivers of the transition of DNA and RNA molecules from states with anti-parallel strands into duplexes with parallel strands [45].

CONCLUSIONS.
In this study we performed a careful QM/QTAIM investigation aimed at identifying all possible quantum mechanisms of the tautomerization of the classical G·C nucleobase pairs as their intrinsic property, which allows the following conclusions to be drawn.
Conflicts of interest/Competing interests: Not applicable.
Availability of data and material: Not applicable.