Dynamic basis for dA•dGTP and dA•d8OGTP misincorporation via Hoogsteen base pairs

Replicative errors contribute to the genetic diversity needed for evolution but in high frequency can lead to genomic instability. Here, we show that DNA dynamics determine the frequency of misincorporating the A•G mismatch, and altered dynamics explain the high frequency of 8-oxoguanine (8OG) A•8OG misincorporation. NMR measurements revealed that Aanti•Ganti (population (pop.) of >91%) transiently forms sparsely populated and short-lived Aanti+•Gsyn (pop. of ~2% and kex = kforward + kreverse of ~137 s−1) and Asyn•Ganti (pop. of ~6% and kex of ~2,200 s−1) Hoogsteen conformations. 8OG redistributed the ensemble, rendering Aanti•8OGsyn the dominant state. A kinetic model in which Aanti+•Gsyn is misincorporated quantitatively predicted the dA•dGTP misincorporation kinetics by human polymerase β, the pH dependence of misincorporation and the impact of the 8OG lesion. Thus, 8OG increases replicative errors relative to G because oxidation of guanine redistributes the ensemble in favor of the mutagenic Aanti•8OGsyn Hoogsteen state, which exists transiently and in low abundance in the A•G mismatch. Replicative errors contribute to genetic diversity needed for evolution but in high frequency lead to genomic instability. Here, NMR is used to show via a kinetic model that DNA dynamics can determine the misincorporation of A•G and A•8OG mismatches.

On rare occasions, DNA polymerases make copying mistakes that, when left uncorrected, can lead to mutation. These spontaneous mutations contribute to the genetic diversity needed to fuel evolution and adaptation 1,2 . Yet, in high frequency, DNA copying errors can cause genomic instability 3 and cancer-causing mutations 4 . The mechanisms that determine the frequency of copying errors are therefore of great interest for understanding evolution and disease.
It is reasonable to assume that a copying error originates from the copier, namely the DNA polymerase. Indeed, low-fidelity DNA polymerases are more error-prone 5 . In addition, certain mutations in replicative polymerases lead to increased copying errors and mutations linked to colorectal and endometrial cancers 6 . Yet, for normally functioning replicative polymerases, there exists evidence that the molecule being copied, the DNA itself, can contribute to spontaneous copying errors.
The idea that DNA can partake in copying errors dates back to the second paper describing DNA structure 7 , in which it was proposed that through base tautomerization, wobble A•C and G•T mismatches could masquerade as Watson-Crick-like conformational states contributing to replicative errors. It was subsequently shown that modifications such as O 6 -methyl-guanine that substantially increase mutation rates 8 stabilize the tautomeric Watson-Crick-like conformational state of the G•T mismatch 9 and that the Watson-Crick-like conformational state could also form through ionization of the thymidine base 10,11 . Crystal structures showed Watson-Crick-like conformational states of A•C 12 and G•T 13,14 evading fidelity checkpoints and being accommodated within DNA polymerase active sites in catalytically active conformations. Topal and Fresco proposed Watson-Crick-like conformations for many other mismatches, including A•G, A•A and G•G 15 . However, none of these conformational states have, to date, been observed within the active sites of replicative DNA polymerases or have been definitively shown through biochemical means to contribute to replicative errors.
Using solution NMR, we recently visualized G•T mismatches transiently forming sparsely populated short-lived Watson-Crick-like conformations 10,11 . We showed that the probability with which these mutagenic Watson-Crick-like conformational states form determines the probability of G•T misincorporation, its pH dependence and how it varies with chemical modifications 11 . The NMR data combined with kinetic modeling supported the general idea first proposed by Topal and Fresco 15 that the probability with which DNA forms alternative conformations helps determine the probability of misincorporation and, by extension, the mutation rate.
Mutagenic chemical modifications have long provided important clues concerning the mutagenic conformational states responsible for nucleotide misincorporation in unmodified DNA 9,16-18 . 8-Oxoguanine (8OG) is the most frequently occurring form of oxidative damage 19 . This lesion is highly mutagenic because it has a high propensity for misincorporation as A•8OG 20 . Mutations due to 8OG are linked to gastrointestinal cancer, lung cancer, sarcoma and melanoma 21 . The molecular basis for the high mutagenicity of 8OG is not fully understood, but it is proposed to arise because the lesion can form an A anti •8OG syn Hoogsteen conformation with geometry like that of the canonical Watson-Crick base pair (bp). Now consider the unmodified A•G, which is the most frequently misincorporated purine-purine mismatch; it predominantly forms A anti •G anti (Fig. 1a) at physiological pH with an internucleotide distance (~11.6 Å) much greater than that of the canonical Watson-Crick bp (~10.5 Å). High-fidelity DNA polymerases, with their active sites optimized for the Watson-Crick geometry, readily discriminate against A•G 22 . By contrast, A•8OG predominantly forms the A anti •8OG syn Hoogsteen bp in which the 8OG base flips 180° and comes into closer proximity with its partner adenine to form Hoogsteen-type hydrogen bonds (Fig. 1b). DNA polymerases can accommodate the A anti •8OG syn Hoogsteen bp 20 in a catalytically active conformation like that observed with the canonical Watson-Crick A•T bp.
Analogously, the misincorporation of unmodified A•G could proceed via A anti + •G syn or A syn •G anti Hoogsteen bps. Both Hoogsteen bps have been observed in X-ray crystal and NMR structures of the A•G mismatch in DNA duplexes under various conditions (Supplementary Table 1). Like A anti •8OG syn , the A anti + •G syn and A syn •G anti Hoogsteen bps have internucleotide distances (10.7-11.0 Å) more like Watson-Crick bps (10.5 Å) than A anti •G anti (11.6 Å; Fig. 1a). However, there are no crystal structures of catalytically active polymerases bound to A•G. There is a crystal structure of human DNA polymerase β in which the incoming non-hydrolyzable dATP is positioned adjacent to an abasic site, with the template G shifted in register due to strand slippage 23 . Whether this unusual unpaired conformation provides the dominant mechanism for misincorporating A•G or if there are additional pathways involving Hoogsteen bps remains unknown. There is a crystal structure of the ribosome in a catalytically active conformation accommodating an A syn •G anti Hoogsteen bp in the second position of the mRNA:tRNA minihelix 24 . The Hoogsteen bp with an internucleotide distance like Watson-Crick bps reinforced the hypothesis that the ribosome can accommodate Watson-Crick-like mismatches, leading to translational errors.
In contrast to G•T, little is known regarding the dynamics of A•G and A•8OG in duplex DNA. Here, using NMR chemical exchange, we found that A anti •G anti (population (pop.) of >91%) in duplex DNA transiently forms sparsely populated and short-lived A anti + •G syn (pop. of ~2% and k ex = k forward + k reverse of ~137 s −1 ) and A syn •G anti (pop. of ~6% and k ex of ~2,200 s −1 ) Hoogsteen bps in two different sequence contexts. 8OG redistributed the ensemble, rendering mutagenic A anti •8OG syn the dominant conformation. A kinetic model where A anti + •G syn is misincorporated following binding in an A anti •G anti conformation quantitatively predicted the kinetics of dA•dGTP misincorporation by human polymerase β, its pH dependence and the impact of 8OG. The results support a potentially general role for DNA dynamics in determining the probabilities of replicative errors, identify several hidden A•G and A•8OG conformations under solution conditions that can also play roles in damage repair and show that 8OG increases replicative errors by increasing the abundance of a sparsely populated, short-lived and mutagenic conformational state in a preexisting dynamic ensemble.
Article https://doi.org/10.1038/s41589-023-01306-5

A anti •G anti is the dominant ground state
We initially used solution-state NMR spectroscopy to characterize the dominant conformation of A•G in two different sequence contexts, 5′-GAC (hpGAC) and 5′-GAT (hpGAT; Fig. 1c). These sequence contexts were chosen because they yielded high-quality NMR spectra under two pH conditions (Extended Data Figs. 1 and 2a). We embedded the A•G mismatch in the same hairpin construct used in our prior studies of tautomerization and ionization dynamics of G•T/U mismatches in DNA and RNA duplexes 10,11 . The apical loop serves to stabilize the mismatch-containing duplex and minimize erroneous contributions from melting 25 . For both hpGAC and hpGAT at pH = 7.4 and T = 25 °C, the NMR data unambiguously established A anti •G anti to be the ground-state (GS) conformation (Fig. 1d and Extended Data Figs. 1 and 2a). A anti •G anti is also the most common conformation of the A•G mismatch observed in our survey of X-ray and NMR structures of DNA duplexes (Supplementary Table 1). We observed the G-H1 imino resonance at ~13.6 ppm expected for A anti •G anti -type hydrogen bonding (Fig. 1d). In addition, the G-C8, G-C1′, A-C8, A-C1′ and A-C2 chemical shifts were consistent with a neutral anti purine base (Extended Data Fig. 1). Finally, we observed distance-based NOE connectivity including purine-(H8···H2′/H2′′) and A(H2)···G(H1) expected for A anti •G anti . Conversely, we did not observe any NOE signatures characteristic of the syn purine base in a Hoogsteen bp, including the strong purine-(H1′···H8) and A-(H2···H2′/H2′′) cross-peaks (Fig. 1d,e). These A anti •G anti NMR signatures were robustly observed over a range of temperatures (1-25 °C).

A anti + •G syn and A syn •G anti
Several resonances belonging to the A•G mismatch in two-dimensional (2D) [ 13 C, 1 H] HSQC spectra were substantially broadened possibly due to micro-to-millisecond timescale exchange with alternative conformational states. To test this possibility, we performed off-resonance spin relaxation in the rotating frame (R 1ρ ) and chemical exchange saturation transfer (CEST) NMR experiments to characterize short-lived low-populated 'excited states' (ESs) over broad timescales (Methods) 26 .
R 1ρ measures the contribution (R ex ) to the transverse relaxation rate (R 2 ) during a period where a continuous radiofrequency (RF) field is applied with variable power (ω SL ) and frequency (ω RF ). For a system experiencing detectable exchange, relaxation dispersion (RD) profiles depicting the dependence of R 2 + R ex on ω SL and ω RF show a characteristic peak typically centered at the difference between the chemical shifts of the GS and ES (Δω = ω ES - ω GS ). CEST measures the impact of conformational exchange on longitudinal GS magnetization during a relaxation period where a continuous RF field is applied with variable power (ω SL ) and frequency (ω RF ). When applied on-resonance with the ES, the RF field saturates ES magnetization, which can be transferred via conformational exchange to the GS. This reduces the GS signal intensity, typically resulting in a minor dip at ω ES = Δω. A major dip is also observed at ω GS = 0. The R 1ρ and CEST data can be fit to Bloch-McConnell (B-M) equations to determine the exchange rate (k ex = k forward + k reverse ), ES population (p ES ) and difference between the ES and GS chemical shifts (Δω = ω ES - ω GS ). We focus on NMR experiments performed on hpGAC containing a [ 13 C, 15 N]-labeled A•G mismatch at a pH of 7.4 and T of 25 °C unless indicated otherwise. Similar results were obtained for hpGAT (Extended Data Fig. 2). We first present R 1ρ and then CEST data.
If A anti •G anti were to exchange with A syn •G anti , within R 1ρ detection, we would expect RD at A-C8 and A-C1′ as both carbons shift downfield due to anti-syn isomerization of the adenine base 27 . For exchange with A anti + •G syn , we would expect RD at G-C8 and G-C1′ as both carbons shift downfield due to anti-syn isomerization of the guanine base 27,28 and at A-C2 and to a lesser extent A-C8 as both carbons would shift from adenine N1 protonation 29,30 . Indeed, we observed sizeable off-resonance R 1ρ RD for all above-mentioned carbon nuclei (Fig. 2).
The A-C8 and A-C1′ RD data could be combined in a global two-state fit, yielding downfield-shifted A-C8 (Δω of ~2.8 ppm) and A-C1′ (Δω of ~3.7 ppm) chemical shifts consistent with an A syn •G anti ES and with p ES = 6.0 ± 0.1% and k ex = 2,210 ± 70 s −1 (Fig. 2a,c). Although statistically unjustified, similar exchange parameters were obtained for one of the two ESs in a three-state fit, but those of the second ES were poorly defined by the data. To confirm that the A syn •G anti ES is stabilized by Hoogsteen hydrogen bonds, we performed high-power 1 H CEST experiments at 10 °C targeting the imino G-H1, which is expected to experience a notable exchange contribution due to changes in hydrogen bonds when forming the Hoogsteen ES. Indeed, we observed a dip in the G-H1 1 H CEST profile at Δω of ~-1.5 ppm as expected for A syn •G anti and with exchange parameters in excellent agreement with those measured independently using A-C8 and A-C1′ R 1ρ at the same temperature (Extended Data Fig. 3). Here, we observe A syn •G anti under solution, which, until now, has only been observed in select crystal structures of duplex DNA (Supplementary Table 1). In the case of the ribosome, this Hoogsteen conformational state has been observed and implicated in translational errors 24 .
The G-C8, G-C1′ and A-C2 RD data could also be combined in a separate two-state fit, yielding downfield-shifted G-C8 (Δω of ~4.0 ppm) and G-C1′ (Δω of ~4.2 ppm) and upfield-shifted A-C2 (Δω of ~-6.7 ppm) as expected for an A anti + •G syn ES (Fig. 2b,c). These RD profiles were also pH dependent, consistent with a protonated A anti + •G syn ES with a more substantial exchange contribution at a lower pH (Extended Data Fig. 4). Again, a three-state fit yielded similar exchange parameters for one ES, but those of the second ES were poorly defined by the data. The A anti + •G syn conformation has previously been observed by X-ray crystallography and NMR but only under acidic conditions (pH < 7; Supplementary Table 1). Interestingly, this exchange process was unusually slow, and, consequently, the population and exchange rate were poorly determined by R 1ρ .
We overcame slow-exchange degeneracy using 13 C CEST experiments. For G-C8, G-C1′ and A-C2, we observed CEST dips indicative of slow exchange with an ES (Fig. 2b). The CEST data could be combined in a global fit, yielding ES chemical shifts consistent with A anti + •G syn with p ES = 1.4 ± 0.1% and a slow exchange rate of k ex = 136 ± 6 s −1 . To our knowledge, this is the slowest exchange rate observed to date for isomerization of an unmodified bp in a nucleic acid duplex (Fig. 2d).
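The fitted exchange parameters can be unpacked into individual rate constants. The sketch below assumes a simple two-state GS ⇌ ES scheme, in which k forward = p ES × k ex and k reverse = (1 − p ES ) × k ex . The function name and the treatment of each ES as an independent two-state process are illustrative assumptions and do not reproduce the B-M fitting procedure.

```python
def two_state_rates(p_es, k_ex):
    """Split k_ex into forward and reverse rate constants.

    Assumes a two-state GS <-> ES equilibrium, where
    k_ex = k_forward + k_reverse and p_ES = k_forward / k_ex.
    """
    if not 0.0 < p_es < 1.0:
        raise ValueError("p_es must be a fraction between 0 and 1")
    k_forward = p_es * k_ex
    k_reverse = (1.0 - p_es) * k_ex
    return k_forward, k_reverse


# Fitted values quoted in the text (hpGAC, pH 7.4, 25 C).
fits = {
    "A syn - G anti": (0.060, 2210.0),   # from 13C R1rho
    "A anti+ - G syn": (0.014, 136.0),   # from 13C CEST
}

for name, (p_es, k_ex) in fits.items():
    kf, kr = two_state_rates(p_es, k_ex)
    lifetime_ms = 1000.0 / kr  # mean ES lifetime = 1 / k_reverse
    print(f"{name}: k_forward={kf:.1f} s^-1, "
          f"k_reverse={kr:.1f} s^-1, ES lifetime={lifetime_ms:.2f} ms")
```

Under this two-state assumption, A anti + •G syn forms with a k forward of ~1.9 s −1 , roughly 70-fold slower than A syn •G anti (~133 s −1 ).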

A anti + •G syn and A syn •G anti
We used the mutate-and-chemical-shift-fingerprint strategy 26,31 to verify the Hoogsteen ESs (Fig. 3a). In this approach 27,28 , chemical modifications invert the equilibria, rendering a specific ES the dominant GS. The chemical shifts of the mutant are then compared to those measured for the transient ES using R 1ρ and CEST.
We stabilized A syn •G anti and A anti + •G syn by methylating the purine N1 position using N 1 -methyl-2′-deoxyadenosine (m 1 A) and N 1 -methyl-2′-deoxyguanosine (m 1 G), respectively 27 . These naturally occurring forms of damage 32 were previously used to stabilize A syn •T anti and G syn •C + anti Hoogsteen bps in duplex DNA, respectively. The methyl group destabilizes A anti •G anti by removing a hydrogen bond and causing a steric collision with the partner base but has little effect on the Hoogsteen bp. NMR spectra confirmed that the N1-methylated bases adopt m 1 A syn + •G anti and A anti + •m 1 G syn Hoogsteen bps, including the observation of an intense intraresidue (H1′···H8) NOE cross-peak consistent with a syn base (Extended Data Fig. 5).
[Fig. 2 caption, opening words lost; x axis of the R 1ρ profiles: Ω 2π −1 (kHz)] ... A anti + •G syn Hoogsteen bps in hpGAC. a, The spins used to probe A anti •G anti exchange to A syn •G anti are in green. Shown are the off-resonance 13 C R 1ρ and 1 H CEST profiles measured at 25 °C and 10 °C, respectively. b, The spins used to probe A anti •G anti exchange to A anti + •G syn are in blue. Shown are the 13 C CEST profiles measured at 25 °C. All experiments were done at a pH of 7.4 in NMR buffer as described in Methods. RF powers used for CEST and spin-lock powers used for R 1ρ are color coded in a and b. Solid lines in a and b denote the global fits to the data using B-M equations, as described in Methods. Data for the R 1ρ profiles in a are presented as values ± 1 s.d. from Monte Carlo simulations for one measurement, as described in Methods. Data for the CEST profiles in a and b are presented as mean values ± 1 s.d. (smaller than the data points) from n = 3 independent measurements of the peak intensities at zero relaxation delay, as described in Methods. c, The population and exchange rate (k ex = k forward + k reverse ) obtained from fitting the R 1ρ or CEST data reveal two distinct (denoted in green and blue bars) ESs. Data are presented as mean values ± 1 s.d. from Monte Carlo simulations (number of iterations = 500) for one R 1ρ or CEST measurement as described in Methods. d, Comparison of k forward and k reverse for A syn •G anti (green) and A anti + •G syn ...
We observed excellent agreement between the chemical shifts measured for the m 1 A syn + •G anti and A anti + •m 1 G syn Hoogsteen bps and those measured for the corresponding two ESs using R 1ρ and CEST (Fig. 3b). The larger downfield shift (~6.0 ppm) observed for m 1 A-C8 than for the A-C8 ES (Extended Data Fig. 5) is expected and can be attributed to protonation of the base due to methylation 27 . The m 1 A had a negligible effect (Δω < 0.2 ppm) on the G-C8 and G-C1′ chemical shifts, explaining why these nuclei do not sense a sizable chemical exchange contribution from the A syn •G anti ES. We could independently stabilize A anti + •G syn by lowering the pH of the unmodified hpGAC duplex to 5.4, and the resulting chemical shifts for A-C8, A-C2, G-C8 and G-C1′ were also in excellent agreement with those measured for the ES using RD (Fig. 3b and Extended Data Fig. 5). These results indicate an apparent pK a of ~6-7, in good agreement with values previously reported for A anti + •G syn in two RNAs and in a DNA duplex 29,30,33 .
Interestingly, according to the m 1 G trap and low pH measurements, A anti + •G syn should be accompanied by a large Δω of ~2 ppm for A-C8 (Extended Data Fig. 5). However, at pH 7.4, B-M simulations show the A anti + •G syn contribution to A-C8 RD to be masked by the larger contribution from A syn •G anti (Extended Data Fig. 4). We could resolve the A anti + •G syn contribution to A-C8 RD by lowering the pH from 7.4 to 6.9, and the resulting Δω of ~3 ppm chemical shift agreed with predictions from the trap (Fig. 3b).

8OG redistributes the ensemble in favor of A anti •8OG syn
In contrast to A•G, A•8OG forms the Hoogsteen A anti •8OG syn bp in 41 of 42 surveyed X-ray crystal structures (Supplementary Table 2), and this is the only conformation observed by NMR 34 . Why 8OG prefers the syn over the anti conformation is not entirely understood, but it has been speculated that syn-8OG avoids steric hindrance and unfavorable electrostatic interactions between the 8-oxo group on the C8 and 5′-phosphate group of 8OG 20 . Interestingly, there is one crystal structure of an A syn •8OG anti Hoogsteen bp within a DNA duplex bound to Dpo4 polymerase 2 bp away from the active site 35 . This leaves open the possibility that A anti •8OG syn exchanges with alternative conformations, with potentially important implications for its misincorporation and damage repair.
In addition, we observed an ~3.5 ppm downfield-shifted 8OG-C1′ indicative of a syn conformation (Extended Data Fig. 6). In contrast to A anti + •G syn , the A anti •8OG syn bp does not require adenine protonation, as the lesion protonates G-NH7.
Interestingly, A•8OG resonances were also broadened, indicating that A anti •8OG syn might undergo conformational exchange. To test this possibility, we performed off-resonance 13 C R 1ρ and 13 C CEST on an hp8OG DNA duplex containing an A•8OG mismatch with a 13 C, 15 N-labeled adenine within the same 5ʹ-GAC sequence context (Fig. 4a) at a pH of 7.4 and T of 25 °C. Indeed, we observed RD at A-C8 and A-C1′ but not at A-C2 (Fig. 4c).
Fitting of the data revealed two distinct ESs. The ES sensed by A-C8 (Δω of ~1.0 ppm) is consistent with back exchange to form the non-mutagenic A anti •8OG anti , the GS of the undamaged A•G mismatch (Fig. 4d). Remarkably, the A anti •8OG anti ES has a sizeable equilibrium population of 15 ± 6%, with a slow k ex of 202 ± 14 s −1 . The second ES sensed by A-C1′ (Δω = 3 ± 1 ppm) is consistent with A syn •8OG anti and has a lower population of 0.5 ± 0.2%, with a k ex of 4,800 ± 1,300 s −1 (Fig. 4e). The A anti •8OG anti and A syn •8OG anti ES chemical shifts were in excellent agreement with those measured for A anti •G anti and m 1 A syn + •G anti , respectively. Thus, A•8OG can also adopt the A anti •8OG anti and A syn •8OG anti conformations in addition to A anti •8OG syn .
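To make the redistribution concrete, the sketch below tabulates the three-state ensembles of A•G and A•8OG from the fitted ES populations, taking the GS population as whatever remains after the two ESs. The free energies use the standard Boltzmann relation ΔG = −RT ln(p i /p GS ) at 25 °C. This conversion, the helper name and the remainder assumption are illustrative and are not values reported here; the GS remainder also inherits the uncertainty of the fitted ES populations (for example, 15 ± 6%).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15     # 25 C


def ensemble(es_pops):
    """Return {state: (population, dG relative to GS in kcal/mol)}.

    es_pops maps ES names to fractional populations; the GS is
    assigned the remaining population (an assumption of this sketch).
    """
    p_gs = 1.0 - sum(es_pops.values())
    if p_gs <= 0.0:
        raise ValueError("ES populations must sum to less than 1")
    table = {"GS": (p_gs, 0.0)}
    for name, p in es_pops.items():
        dg = -R_KCAL * T_KELVIN * math.log(p / p_gs)
        table[name] = (p, dg)
    return table


# ES populations quoted in the text (5'-GAC context, pH 7.4, 25 C).
a_g = ensemble({"A syn - G anti": 0.060, "A anti+ - G syn": 0.014})
a_8og = ensemble({"A anti - 8OG anti": 0.15, "A syn - 8OG anti": 0.005})

for label, table in (("A-G (GS = A anti - G anti)", a_g),
                     ("A-8OG (GS = A anti - 8OG syn)", a_8og)):
    print(label)
    for state, (p, dg) in table.items():
        print(f"  {state}: pop={100 * p:.1f}%, dG={dg:.2f} kcal/mol")
```

With these inputs, the Hoogsteen state with a syn guanine goes from ~1.4% of the A•G ensemble to ~84.5% of the A•8OG ensemble.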

Misincorporation via A anti + •G syn
Several crystal structures of polymerases in catalytically active conformations show that the oxidatively damaged A anti •8OG syn Hoogsteen bp is accommodated much like a canonical Watson-Crick A•T bp. A•G misincorporation could potentially proceed via the A anti + •G syn or A syn •G anti Hoogsteen intermediates, which have Watson-Crick-like internucleotide distances (Fig. 1a). This added DNA conformational step could explain why A•G misincorporation can occur with an ~100-fold slower kinetic rate than that of A•T Watson-Crick bps. Such a kinetic mechanism involving tautomeric and anionic Watson-Crick-like conformational states has recently been shown to quantitatively predict the misincorporation kinetics of G•T mismatches across a range of pH conditions and chemical modifications 11 . We used kinetic simulations to test the plausibility of A anti + •G syn (model 1) and/or A syn •G anti (model 2) as mutagenic intermediates during the misincorporation of A•G by human polymerase β (Fig. 5a and Extended Data Fig. 7). We focused on human polymerase β because a detailed kinetic model is available for correct dT•dATP 36 , there are presteady-state kinetic experiments examining misincorporation of dA•dGTP using the same 5′-GAT context used in our NMR study of hpGAT 37 , and this polymerase has been implicated in the repair of A•8OG lesions and by extension A•G 38 . We tested whether these short-lived Hoogsteen conformational states could explain the ~100-fold slower kinetic rate of misincorporation of dA•dGTP than dA•dTTP (Fig. 5b).
We developed our kinetic model using the same principles underpinning our prior model for G•T misincorporation 11 . dGTP initially binds to the DNA polymerase-template complex in the dA anti •dGTP anti GS with DNA polymerase in the ajar state (Fig. 5a). The nucleotide base of either the template dA or bound dGTP then isomerizes into the syn conformation to form the Hoogsteen intermediate with kinetic rate constants determined by the NMR data measured in the unbound DNA duplex (Fig. 3a). Once the Watson-Crick-like mutagenic Hoogsteen intermediate is formed, all subsequent steps, including transition of the polymerase into the catalytically active closed conformation and the chemistry step (Fig. 5a and Extended Data Fig. 7), are assumed to proceed with the same kinetic rate constants measured for the reference canonical dA•dTTP Watson-Crick bp 11 . Any non-mutagenic Hoogsteen intermediate is allowed to dissociate from the polymerase, just like the G•T wobble was allowed to dissociate from the DNA polymerase ajar state in our prior model 11 . For example, in model 1 in which only dA anti + •dGTP syn is misincorporated, dA syn •dGTP anti can unbind the polymerase from the ajar state. We tested models in which dA anti + •dGTP syn (model 1), dA syn •dGTP anti (model 2) or both (model 1 + 2) could be misincorporated (Fig. 5a).
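The gating logic of this scheme can be illustrated with a deliberately reduced sketch: a GS ⇌ Hoogsteen intermediate → product pathway in which one lumped rate constant, k_commit (a hypothetical placeholder), stands in for every polymerase step after isomerization. This is not the full kinetic model used here; nucleotide binding and dissociation are ignored and the k_commit values are arbitrary. Under a steady-state approximation for the intermediate, the apparent rate is k forward × k_commit/(k reverse + k_commit), which can never exceed k forward .

```python
def apparent_rate(k_forward, k_reverse, k_commit):
    """Steady-state rate for GS <-> ES -> product.

    k_commit lumps all steps after isomerization into a single
    hypothetical rate constant (an assumption of this sketch).
    """
    return k_forward * k_commit / (k_reverse + k_commit)


# Isomerization rates from the NMR fits under a two-state assumption:
# k_forward = p_ES * k_ex and k_reverse = (1 - p_ES) * k_ex.
pathways = {
    "model 1 (A anti+ - G syn)": (0.014 * 136.0, 0.986 * 136.0),
    "model 2 (A syn - G anti)": (0.060 * 2210.0, 0.940 * 2210.0),
}

# Arbitrary placeholder values spanning slow to very fast commitment.
for k_commit in (1.0, 100.0, 1e6):
    for name, (kf, kr) in pathways.items():
        k_app = apparent_rate(kf, kr, k_commit)
        print(f"k_commit={k_commit:g} s^-1, {name}: "
              f"k_app={k_app:.3g} s^-1 (ceiling k_forward={kf:.3g} s^-1)")
```

However fast the downstream steps are made, the apparent rate through each pathway stays below its k forward , which is the sense in which a slowly forming intermediate can become rate limiting.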
Strikingly, model 1, where dA anti + •dGTP syn , which mimics mutagenic dA anti •d8OGTP syn , is misincorporated, quantitatively predicted the ~100-fold slowdown in k pol,incorrect . Conversely, model 2, where dA syn •dGTP anti is misincorporated (Fig. 5a), overestimated k pol,incorrect by >30-fold, owing to a much higher population and exchange rate for dA syn •dGTP anti (Fig. 5b). Model 1 + 2, where both dA syn •dGTP anti and dA anti + •dGTP syn can be misincorporated, also overestimated k pol,incorrect by >30-fold due to the higher population and exchange rate of forming dA syn •dGTP anti , making it dominate the process in model 1 + 2 over the formation of dA anti + •dGTP syn . If protonated dA anti + •dGTP syn were the mutagenic conformation responsible for misincorporating dA•dGTP by polymerase β, the misincorporation rate should diminish with increasing pH. Such pH-dependent misincorporation kinetics helped previously identify anionic Watson-Crick-like G•T. Indeed, based on a prior presteady-state kinetic study, k pol,incorrect for dA•dGTP decreased ~20-fold when increasing the pH from 7.7 to 8.4 (refs. 37,39), whereas it changed by less than 2-fold for the reference Watson-Crick bp. Our model can explain this pH dependence given the pH dependence of the mutagenic DNA dynamics forming A anti + •G syn (Extended Data Fig. 4). We tested our kinetic model by simulating data at a high pH (Methods). Remarkably, our model quantitatively predicted the ~20-fold decrease in k pol,incorrect from near-neutral to alkaline pH (Fig. 5b).
A•8OG is misincorporated at an ~20-fold higher rate than A•G 40 . Our NMR data show that 8OG redistributes the ensemble to favor the mutagenic A anti •8OG syn Hoogsteen conformation. We tested whether a variation of model 1 (model 1′) where dA•d8OGTP binds the polymerase and is subsequently misincorporated in the dA anti •d8OGTP syn Hoogsteen GS could predict these rate enhancements (Fig. 5a). Again, the kinetic model quantitatively predicted the ~20-fold enhancement in k pol,incorrect for 8OG relative to G (Fig. 5b). We also applied similar variations of model 2 (model 2′ and model 3′) in which dA•d8OGTP binds the polymerase as the dA anti •d8OGTP syn Hoogsteen GS and only dA anti •d8OGTP anti (model 2′) or dA syn •d8OGTP anti (model 3′) is accepted as the mutagenic conformation. While the simulation results from model 3′ can definitively rule out dA syn •d8OGTP anti as the mutagenic conformation, those from model 2′ cannot entirely rule out dA anti •d8OGTP anti . This is due to the high equilibrium population (pop. ~15%) of A anti •8OG anti , which results in k pol,incorrect values that are only approximately twofold smaller than those simulated for A anti •8OG syn in model 1′. Nevertheless, given that our results clearly rule out A anti •G anti as a mutagenic conformational state in the case of unmodified A•G, they support the accepted model with A anti •8OG syn as the most likely mutagenic state.
Similar results were obtained when simulating the overall fidelity of polymerization (Extended Data Fig. 8). Taken together, these results provide compelling evidence for a kinetic pathway for A•G misincorporation via a Hoogsteen bp in human polymerase β and indicate that 8OG increases replicative errors by increasing the abundance of a preexisting, sparsely populated, short-lived and mutagenic A anti + •G syn Hoogsteen bp found in the undamaged A•G mismatch.

Discussion
Although no crystal structures are available for a paired A•G mismatch bound to a catalytically active polymerase, there are several crystal structures of DNA polymerases, including I 20 , λ 41 , ι 42 and β 17,43 , accommodating the A anti •8OG syn Hoogsteen bp in a closed catalytically active conformation. In many of these structures, the polymerase makes minor groove contacts with the 8-oxo group of the C8 atom on 8OG like those formed with the 2-oxo group of thymidine. Such interactions can explain the ~25-fold tighter apparent K d for A•8OG than for A•G. However, the ~20-fold higher k pol for A•8OG than that for A•G is most likely because A anti •8OG syn conforms to the Watson-Crick geometry, enabling DNA polymerases to transition into the catalytically active closed conformation. In the same way, A anti + •G syn can conform to the Watson-Crick geometry, but because it forms slowly and in low abundance, isomerization becomes rate limiting, resulting in the ~100-fold slowdown in misincorporation kinetics relative to the correct Watson-Crick bp (Fig. 6a). This slowdown due to Watson-Crick-like DNA dynamics is analogous to that observed for tautomeric and anionic Watson-Crick-like G•T 9,11-14 , pointing to a potentially general role for DNA dynamics in helping determine the frequency of misincorporation. Beyond replication, Watson-Crick-like conformational states including the A syn •G anti Hoogsteen bp have also been proposed to contribute to translation errors 24 , suggesting a broader role for bp dynamics in the fidelity of information transfer across the central dogma.
While our kinetic simulations indicate that A anti + •G syn provides a plausible mutagenic pathway for A•G misincorporation by human polymerase β and possibly other polymerases 17,20,41-43 , our results do not rule out alternative pathways involving melted A•G conformations as observed for an X-ray structure in the 5′-CAG sequence context 23,44 or the A syn •G anti Hoogsteen, depending on sequence context, polymerase and physiological conditions. In addition, our kinetic model assumes that the Watson-Crick grip of DNA polymerase 45 is similar to that of a canonical Watson-Crick embedded in a duplex, but Hoogsteen dynamics can possibly vary within the polymerase environment 46 . For example, the polymerase favors the anti conformation for the template base, and this might partly explain why dA syn •dGTP anti is disfavored during incorporation. Future studies should examine the role of the template and sequence context on bp dynamics and kinetics of misincorporation and overcome challenges in measuring chemical exchange in these large ternary protein complexes, which are difficult to produce at high concentrations needed for the NMR experiments. They should also more broadly examine how other sequence and structural contexts, such as distance from terminal ends, and other motifs, such as apical loops, impact the measured dynamics.
Our results show that 8OG redistributes a three-state dynamic ensemble formed by the unmodified A•G mismatch to favor the mutagenic form (Fig. 6a). The A anti •8OG syn GS back exchanges to form non-mutagenic A anti •8OG anti (pop. of ~15%) and to a lesser extent A syn •8OG anti (pop. of ~1%; Fig. 6a). These results reinforce an emerging perspective that chemical modifications do not create new conformational states but rather redistribute a preexisting dynamic ensemble in favor of a particular conformation 47 . This ensemble perspective is important; the extent of 8OG mutagenicity could vary on a continuous scale depending on the degree to which the lesion stabilizes the mutagenic A anti •8OG syn conformation relative to other non-mutagenic states in the ensemble. Because non-mutagenic A anti •8OG anti is ~15% populated, there is considerable room to increase or decrease the mutagenicity by tilting this conformational equilibrium 11 . Future studies should examine how these dynamics vary with sequence context and whether this in turn results in corresponding sequence-dependent changes in misincorporation probabilities.
To mitigate the high frequency of A•8OG misincorporation, the GO repair pathway 38 evolved to repair 8OG in the context of the A•8OG mismatch. The ESs of A•8OG uncovered here are also likely to play roles as intermediates during 8OG damage repair. In this pathway, MutY/MUTYH recognize and excise the adenine base from A•8OG 38 . In a crystal structure of the recognition complex 48 , MutY and MUTYH initially bind to the A anti •8OG syn GS conformation (Fig. 6b). In another crystal structure 49 , MutY and MUTYH stabilize 8OG in an anti conformation, while the adenine adopts an extrahelical syn conformation bound within the enzyme active site poised for glycosidic bond cleavage (Fig. 6b).
Because there is no room for the adenine to isomerize to the syn conformation within the MutY/MUTYH active site, isomerization most likely occurs before extrahelical flipping. This could occur through the same double isomerization from A anti •8OG syn to A syn •8OG anti (Fig. 6b) observed by our NMR RD measurements (Fig. 4e). Alternatively, 8OG could initially isomerize to form A anti •8OG anti followed by flipping of the adenine to form A syn •8OG anti (Fig. 6b). Thus, the same DNA dynamics contributing to copying errors could be used to aid the repair of mistakes. Future studies should examine the potential role of these DNA dynamics in determining the kinetic and thermodynamic propensities of 8OG damage repair.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41589-023-01306-5.

Protein Data Bank structural survey
All X-ray structures with a resolution of ≤3.0 Å and NMR biological assemblies containing DNA molecules (including unbound DNA, DNA-protein complexes and so on) were downloaded from the RCSB Protein Data Bank (PDB) in February 2018 and processed by X3DNA-DSSR 51 to generate a searchable database containing DNA structural information. Potential candidates of A•G bps were identified using the following filters in the database: (1) the A•G bp is located in the helical region of the DNA, defined as at least 2 bps away from other features, such as terminal ends, apical loops, junctions and so on; (2) the A•G bp is cis according to the Leontis-Westhof classification 52 and (3) the A•G bp contains hydrogen-bond distances less than 3.5 Å. We then manually inspected all the A•G bps and removed the misregistered bps, tandem A•G/G•A bps and bps that were not located in the helical region of the DNA. This same approach was applied for the A•8OG mismatches. The PDB structural survey results are summarized in Supplementary Tables 1 and 2.
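The three automated filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BasePair` record and its field names are hypothetical and do not correspond to actual X3DNA-DSSR output keys, and filter (3) is interpreted here as requiring every listed hydrogen-bond distance to be below 3.5 Å.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasePair:
    # Hypothetical record; field names are illustrative, not DSSR output keys.
    pdb_id: str
    bps_from_nearest_feature: int  # distance (in bps) to ends, loops, junctions
    orientation: str               # "cis" or "trans" (Leontis-Westhof)
    hbond_distances: List[float] = field(default_factory=list)  # in angstroms


def passes_filters(bp: BasePair, min_separation: int = 2,
                   max_hbond: float = 3.5) -> bool:
    """Apply filters (1)-(3) from the survey to one candidate A-G bp."""
    in_helix = bp.bps_from_nearest_feature >= min_separation      # filter (1)
    is_cis = bp.orientation == "cis"                              # filter (2)
    # Filter (3): assumed to mean that all listed distances are < 3.5 angstroms.
    hbonded = bool(bp.hbond_distances) and all(
        d < max_hbond for d in bp.hbond_distances)
    return in_helix and is_cis and hbonded


# Made-up candidates, for illustration only.
candidates = [
    BasePair("XXX1", 4, "cis", [2.9, 3.1]),
    BasePair("XXX2", 1, "cis", [2.9, 3.0]),    # too close to a terminal end
    BasePair("XXX3", 5, "trans", [2.8, 3.0]),  # not cis
    BasePair("XXX4", 6, "cis", [3.0, 3.7]),    # one distance too long
]
kept = [bp.pdb_id for bp in candidates if passes_filters(bp)]
```

The manual inspection step (removing misregistered and tandem A•G/G•A bps) is not captured by this sketch.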

Sample preparations
Unmodified and unlabeled DNA oligonucleotides. Unlabeled and unmodified DNA oligonucleotides were purchased as single-stranded oligonucleotides from Integrated DNA Technologies with standard desalting purification. The sequences of the oligonucleotides are 5′-GCAGACGCGAAGCGGCTGC-3′ for hpGAC and 5′-GCAGATGCGAAGCAGCTGC-3′ for hpGAT. The secondary structure of the sequences is shown in Fig. 1c.
Modified DNA oligonucleotides. Unlabeled DNA oligonucleotides containing m 1 A, m 1 G and 8OG were purchased from the Yale Keck Oligonucleotide Synthesis Facility, with cartridge purification for the m 1 G single strand, HPLC purification for the m 1 A single strand and cartridge purification for the 8OG single strand using commercially sourced amidites from Glen Research. The oligonucleotides were synthesized on an oligodeoxyribonucleotide synthesizer. The sequences for the m 1 G and m 1 A strand matched those for hpGAC, where the m 1 G replaced the mismatched guanine at position 15, and the m 1 A replaced the mismatched adenine at position 5.
13 C, 15 N-labeled DNA samples. Selectively 15 N, 13 C-site-labeled DNA oligonucleotides were purchased from and synthesized by the Yale Keck Oligonucleotide Synthesis Facility, with commercially available 2′-deoxyadenosine ( 13 C, 98%; 15 N, 98%) and 2′-deoxyguanosine ( 13 C, 98%; 15 N, 98%) phosphoramidites purchased from Cambridge Isotope Laboratories. Cartridge purification was used. The hpGAC sequence was site labeled at positions 5 and 15 for the adenine and guanine, respectively. The hpGAT sequence was site labeled only at position 5 for the adenine.
NMR buffer. Sodium phosphate buffer for NMR experiments was prepared by the addition of equimolar solutions of sodium phosphate monobasic and dibasic salts, sodium chloride and EDTA to give final concentrations of 15 mM sodium phosphate, 25 mM sodium chloride and 0.1 mM EDTA. The pH was then adjusted by adding phosphoric acid or sodium hydroxide. The buffers were then brought up to the desired volume, vacuum filtered and stored for usage.
Sample annealing and buffer exchange. Oligonucleotides were resuspended in water and annealed by heating at a temperature of 95 °C for ~5 min, followed by cooling on ice for ~1 h. Samples were then exchanged into the desired buffer (25 mM sodium chloride, 15 mM sodium phosphate and 0.1 mM EDTA at the desired pH) using Amicon Ultra-4 centrifugal concentrators (4 ml; Millipore Sigma) with a 3-kDa molecular weight cutoff to a final volume of ~250 μl. Deuterium oxide (10% (vol/vol)) was added to the samples before the NMR experiments.

Nuclear magnetic resonance spectroscopy
All NMR experiments were performed using TopSpin 3.2 on 600- and 700-MHz Bruker Avance spectrometers equipped with HCNP and HCN cryogenic probes, respectively. All experiments were performed at a pH of 7.4 and at a temperature of 25 °C in NMR buffer unless stated otherwise. The NMR data were processed and analyzed using NMRPipe 53 and SPARKY (T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco), respectively.
Off-resonance 13 C R 1ρ . Off-resonance R 1ρ experiments were performed using a one-dimensional (1D) selective excitation scheme using selective Hartmann-Hahn transfers to excite signals corresponding to nuclei of interest 26,54 . Magnetization corresponding to the spins was allowed to relax under an applied spin-lock field for a maximal duration (<60 ms for 13 C) chosen to achieve ~70% loss in signal intensity at the end of the relaxation period. The signal intensity was recorded for four to seven delays equally spaced over the relaxation period. Spin-lock powers used for R 1ρ measurements ranged from 100 to 3,000 Hz. Absolute offset frequencies were chosen ranging from zero to four times the given spin-lock power. The experimental conditions for all the 13 C R 1ρ RD experiments (temperature, magnetic field and solvent) are summarized in Supplementary Table 3.
Analysis of R 1ρ relaxation dispersion data. R 1ρ values for a given spin-lock power and offset combination were obtained by fitting the peak intensities (extracted using NMRPipe 53 ) as a function of delay time to a monoexponential function. B-M equations 55 were used to fit R 1ρ values for a given spin as a function of spin-lock power and offset to a two-state exchange model, with the uncertainties in the exchange parameters extracted using a Monte Carlo procedure 26 . Exchange parameters of interest, such as the population of the ES conformation (p ES ), the exchange rate between the GS and the ES (k ex = k forward + k reverse ), and the chemical shift difference between the ES and GS conformations (Δω ES-GS = ω ES - ω GS , in which ω ES and ω GS are the chemical shifts of the ES and GS, respectively) were extracted from the two-state exchange model. Global two-state fits of the R 1ρ RD data for multiple spins were performed by sharing p ES and k ex . The same R 2 and R 1 values were assumed for the GS and ES. Three-state fits with linear topology did not converge to a solution 56 . Triangular topologies could not be fit due to the lack of a probe that simultaneously detects exchange between the ESs. The fitted parameters are summarized in Supplementary Tables 6-10.
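The monoexponential step can be illustrated with a short Python sketch. The fitting routine used for this step is not specified here, so the log-linear least-squares fit below is a stand-in chosen for simplicity rather than the procedure actually used; the delays and intensities are synthetic, and the function name `fit_monoexponential` is hypothetical.

```python
import math


def fit_monoexponential(delays_s, intensities):
    """Fit I(t) = I0 * exp(-R * t) by linear least squares on ln(I).

    Returns (R, I0). Assumes all intensities are positive.
    """
    xs = list(delays_s)
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic, noise-free decay: five equally spaced delays up to 40 ms.
# With R = 30 s^-1 the signal falls to exp(-1.2), i.e. roughly 70% loss.
delays = [0.0, 0.01, 0.02, 0.03, 0.04]
true_rate = 30.0
intensities = [1000.0 * math.exp(-true_rate * t) for t in delays]
r1rho, i0 = fit_monoexponential(delays, intensities)
```

With noisy data, a weighted nonlinear fit is generally preferable because the logarithm distorts the error distribution at low intensities.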
Off-resonance R 1ρ profiles were generated by plotting $R_2 + R_{ex} = (R_{1\rho} - R_1\cos^2\theta)/\sin^2\theta$, where $\theta$ is the angle between the effective field of the observed resonance and the z axis in radians, as a function of $\Omega_{OBS} = \omega_{OBS} - \omega_{RF}$, in which $\omega_{OBS}$ is the Larmor frequency of the observed resonance, and $\omega_{RF}$ is the angular frequency of the applied spin-lock power. Errors in $(R_2 + R_{ex})$ were determined by propagating the error in R 1ρ through a Monte Carlo procedure.
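As a worked illustration, the transformation can be written as a small Python function. This sketch assumes the standard decomposition R1ρ = R1 cos²θ + (R2 + Rex) sin²θ, with tan θ = ω1/Ω, where ω1 is the spin-lock strength and Ω the offset of the observed resonance from the spin-lock carrier. The function name and the numerical values are hypothetical.

```python
import math


def r2_plus_rex(r1rho, r1, spinlock_hz, offset_hz):
    """Return R2 + Rex from R1rho, assuming
    R1rho = R1*cos^2(theta) + (R2 + Rex)*sin^2(theta).

    theta is the tilt of the effective field from the z axis,
    tan(theta) = omega_1 / Omega (both in rad/s).
    """
    omega_1 = 2.0 * math.pi * spinlock_hz
    big_omega = 2.0 * math.pi * offset_hz
    theta = math.atan2(omega_1, big_omega)
    return (r1rho - r1 * math.cos(theta) ** 2) / math.sin(theta) ** 2


# On resonance (offset = 0) theta is 90 degrees and R1rho equals R2 + Rex.
on_res = r2_plus_rex(r1rho=20.0, r1=2.0, spinlock_hz=1000.0, offset_hz=0.0)

# At offset = spin-lock power, theta is 45 degrees, so
# R1rho = 0.5*R1 + 0.5*(R2 + Rex); 8.5 s^-1 here corresponds to 15 s^-1.
off_res = r2_plus_rex(r1rho=8.5, r1=2.0, spinlock_hz=1000.0, offset_hz=1000.0)
```

Removing the R1 contribution in this way makes profiles recorded at different spin-lock powers directly comparable as a function of offset.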
The sensitivity of the R 1ρ data to p ES was examined by assessing the reduced χ 2 value when fixing p ES to a range of values tenfold above and below the best-fit p ES . For certain spins sensing slow conformational exchange (k ex < 250 s −1 ), the R 1ρ RD data could be fitted to a wide range of p ES and k ex values with nearly identical reduced χ 2 values due to slow-exchange degeneracy 25 . By contrast, the Δω ES-GS was well defined by the data.
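A minimal numerical sketch of this degeneracy is given below. It assumes a two-state model, for which k forward = p ES × k ex and k reverse = (1 − p ES) × k ex, and it relies on the textbook slow-exchange limit in which the exchange contribution sensed by the GS is governed mainly by k forward. Under that assumption, different (p ES, k ex) pairs with the same product are difficult to distinguish. The pairs below are illustrative and are not fitted values.

```python
def forward_rate(p_es, k_ex):
    """GS -> ES rate constant for two-state exchange."""
    return p_es * k_ex


def reverse_rate(p_es, k_ex):
    """ES -> GS rate constant for two-state exchange."""
    return (1.0 - p_es) * k_ex


# Illustrative (p_ES, k_ex in s^-1) pairs sharing the same product.
pairs = [(0.02, 137.0), (0.01, 274.0), (0.04, 68.5)]
forward = [forward_rate(p, k) for p, k in pairs]
reverse = [reverse_rate(p, k) for p, k in pairs]
```

All three pairs give the same forward rate of about 2.74 s −1 while the reverse rates differ, which is consistent with the chemical shift difference remaining well defined even when p ES and k ex are not.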
The RD data were inconsistent with a linear topology (GS ⇌ ES1 ⇌ ES2) and were insensitive to the minor exchange kinetics between the two ESs 26,56,57 due to lack of a suitable RD probe simultaneously sensing both ESs 10,56 . In addition, it was not feasible to measure the contribution of A anti + •G syn to the G-H1 CEST profile because of slow exchange kinetics under the low-temperature conditions needed to observe the G-H1 resonance (Extended Data Fig. 3).
The lack of RD at A-C2 is expected given that A syn •8OG anti is not protonated. In addition, based on B-M simulations, A-C8 is not expected to sense adenine isomerization because it is masked by the larger RD contribution due to 8OG isomerization. Once again, the RD data were insensitive to the minor exchange between the two ESs 10,56 due to the lack of an RD probe simultaneously sensitive to both ESs.
13 C Chemical exchange saturation transfer measurements. The 13 C CEST experiments were performed using a pulse sequence and a selective excitation scheme in a 1D manner similar to the 1D R 1ρ experiment 58 . The spin-lock powers used ranged from 10 to 50 Hz. The list of spin-lock power offset combinations used for the 13 [...]
[...] in which k pol is the maximum nucleotide incorporation rate constant, and K d is the apparent equilibrium dissociation constant for the dNTP. It is important to note that the apparent K d does not necessarily reflect the true nucleotide equilibrium dissociation constant. Despite this, the alteration of the input apparent K d has a small effect on our predictions for k pol,incorrect . This analysis was used to simulate correct and incorrect incorporation, and the fitted k pol and K d were used to compute the misincorporation frequency/probability (F pol ): [...]
[...] 100, 200, 300, 500, 750, 1,000 and 1,500 μM. It is assumed that the binary polymerase-template complex is preformed for both correct and incorrect incorporation models. The [dNTP] was varied by multiplying the concentration by the estimated diffusion-limited dNTP association rate constant (k 1 = 100 s −1 μM −1 ). The free [dNTP] was assumed to be constant over the course of the simulation.
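The definitions of k pol and K d given above match the hyperbolic dependence of the observed rate on [dNTP] that is commonly used in polymerase kinetics, k obs = k pol [dNTP]/(K d + [dNTP]). The sketch below assumes that form. It also assumes the widely used definition of misincorporation frequency as the catalytic efficiency (k pol /K d ) of the incorrect nucleotide divided by the sum of the efficiencies of the correct and incorrect nucleotides. Neither expression is taken from the text here, and the parameter values are placeholders rather than fitted values.

```python
def k_obs(dntp_um, k_pol, k_d_um):
    """Assumed hyperbolic form: k_obs = k_pol*[dNTP] / (K_d + [dNTP])."""
    return k_pol * dntp_um / (k_d_um + dntp_um)


def misincorporation_frequency(k_pol_inc, k_d_inc, k_pol_cor, k_d_cor):
    """Assumed definition: eff_inc / (eff_cor + eff_inc), eff = k_pol/K_d."""
    eff_inc = k_pol_inc / k_d_inc
    eff_cor = k_pol_cor / k_d_cor
    return eff_inc / (eff_cor + eff_inc)


# dNTP concentrations (uM) listed in the text.
concentrations_um = [100, 200, 300, 500, 750, 1000, 1500]

# Placeholder parameters, for illustration only (not from the study).
K_POL_INCORRECT, K_D_INCORRECT = 0.1, 400.0  # s^-1, uM
K_POL_CORRECT, K_D_CORRECT = 10.0, 5.0       # s^-1, uM

curve = [k_obs(c, K_POL_INCORRECT, K_D_INCORRECT) for c in concentrations_um]
f_pol = misincorporation_frequency(
    K_POL_INCORRECT, K_D_INCORRECT, K_POL_CORRECT, K_D_CORRECT)
```

In this form, k obs reaches half of k pol when [dNTP] equals K d and approaches k pol at saturating [dNTP].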
Below is a description of all the parameters and system of differential equations used in the simulations. Rate constants are listed in Extended Data Fig. 7, and their values are listed in Supplementary Table 11.

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The NMR data generated in this study are included in the published article and the Supplementary Information file.

Extended Data
Fig. 3 | Exchange parameters for the A•G mismatch measured using R 1ρ in hpGAC at 10 °C. Shown are the off-resonance 13 C R 1ρ profiles measured for A-C1ʹ, A-C8, A-C2, and G-C8 collected at 10 °C and pH 7.4 in NMR buffer as described in Methods. Spin-lock powers used for R 1ρ profiles are color-coded. Solid lines in the profiles denote the global 2-state fits to the data using B-M equations as described in Methods. Data for the R 1ρ profiles were presented as values ± 1 s.d. from Monte Carlo simulations for one measurement as described in Methods. For the R 1ρ profiles corresponding to A-C1ʹ and A-C8, the ± 1 s.d. is smaller than the data points. These exchange measurements at 10 °C only sense A syn -G anti exchange (top) since A anti + -G syn exchange becomes too slow to have a substantial contribution (middle). Probes that sensed exchange at this condition are highlighted in green.