Assigning 13C NMR chemical shifts for the glucosyl residue environments in cellulose within native plant cell walls
Throughout our work31,45,49,50 never-dried native samples of many different plants have been studied using 13C MAS NMR to understand the structure and interactions of cellulose, hemicellulose, lignin and water in plant cell walls. The composition of the secondary cell wall of poplar is relatively simple, with only three main components: cellulose, xylan and lignin. Figure 1 shows the neutral carbohydrate region of the 1D CP MAS NMR spectrum of poplar wood, which is dominated by the cellulose signal. The carbon 1 (C1) peak is at ~ 105 ppm, C2, C3 and C5 overlap in the region between 70 and 76 ppm, and the C4 and C6 carbons are each split into two main peaks. The cellulose glucose environments in domain 1 show C41 and C61 peaks at ~ 89 ppm and ~ 65 ppm respectively, and the C42 and C62 domain 2 peaks are at ~ 84 ppm and ~ 62 ppm respectively. Interestingly, the C4 peaks in the 1D CP MAS spectrum do not have simple Lorentzian line shapes, indicating that there are multiple glucose environments within both domains. The C42 region in particular is clearly split into two peaks at ~ 83.5 and ~ 84.5 ppm.
Fully 13C labelled poplar wood enabled the structure of plant cellulose to be explored using a wealth of 2D NMR experiments. 2D experiments such as CP INADEQUATE and 30 ms CP PDSD were used to resolve several different glucose environments within cellulose. The CP refocussed INADEQUATE is a double quantum (DQ) experiment that correlates two covalently bonded carbons. The bonded carbons appear at the same DQ shift, which is given by the sum of the respective single quantum (SQ) shifts of the correlated carbons. Figure 2 shows the neutral carbohydrate region of the CP refocussed INADEQUATE spectrum of poplar wood. The high resolution means that the carbons within each glucosyl residue can be followed through in the 2D spectrum. This is demonstrated in red for the glucose residue in domain 1 named environment ‘c’, and in black for the C4-C6 region for the residue in domain 2 named environment ‘j’. Our naming convention and assignments are based on those of Wang et al.46 for cellulose in primary cell walls, since many of the cellulose environments we observe have similar values. However, glucose environment j, which is clearly visible for C4, C5 and C6, is newly reported. The C6 region of the spectrum (inset in Fig. 2) shows the main glucose environments identified within domain 1 and domain 2. This region also shows a minor domain 2 environment, named k. In total there are three major glucose environments (a, b, c) in domain 1 and three (f, g, j) in domain 2.
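As a worked illustration of the double quantum relationship described above, the short sketch below computes the DQ shift expected for each directly bonded carbon pair of environment c from the single quantum shifts listed in Table 1. The function name and data layout are illustrative assumptions and are not part of the original analysis.

```python
# SQ 13C shifts (ppm) for domain 1 environment c, transcribed from Table 1.
SQ_SHIFTS_C = {"C1": 104.1, "C2": 71.8, "C3": 75.2,
               "C4": 88.1, "C5": 71.2, "C6": 65.8}

# Directly bonded carbon pairs within one glucosyl residue.
BONDED_PAIRS = [("C1", "C2"), ("C2", "C3"), ("C3", "C4"),
                ("C4", "C5"), ("C5", "C6")]


def dq_shifts(sq_shifts, pairs=BONDED_PAIRS):
    """Return {(Ci, Cj): DQ shift in ppm}, with DQ = SQ(Ci) + SQ(Cj).

    Pairs for which either carbon has no SQ shift are skipped.
    """
    return {
        (ci, cj): round(sq_shifts[ci] + sq_shifts[cj], 1)
        for ci, cj in pairs
        if ci in sq_shifts and cj in sq_shifts
    }


for (ci, cj), dq in dq_shifts(SQ_SHIFTS_C).items():
    print(f"{ci}-{cj}: both carbons appear at DQ = {dq:.1f} ppm")
```

For example, the C4 and C5 carbons of environment c (88.1 and 71.2 ppm) would both be expected at a DQ shift of about 159.3 ppm, which is how a residue can be traced carbon by carbon through the 2D spectrum.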
To make a more complete assignment of the NMR shifts of the glucose environments within cellulose, a 30 ms CP PDSD spectrum was analysed alongside the INADEQUATE spectrum. The CP PDSD experiment is a through-space correlation experiment which, with a mixing time of 30 ms, shows cross peaks between carbons only if they are within the same glucose ring. SI Fig. 1 shows the 30 ms CP PDSD spectrum, highlighting the key regions which are particularly useful for identifying different glucose environments. Two further minor glucose environments in domain 1, labelled d and e (which is very minor), have previously been assigned by Wang et al.46. Table 1 lists the 13C NMR shifts for all the different glucose environments identified in cellulose, including the more minor ones.
Table 1
NMR chemical shifts of all glucose environments assigned in cellulose of poplar wood. *The naming convention and assignments are based on those of Wang et al.46 with additional environments. The minor glucose environments d, e and k are italicised. A comparison to Iα and Iβ cellulose NMR chemical shifts is shown in SI Table 1. $All 13C NMR chemical shifts are in ppm and have an error of ± 0.1 ppm.
Domain 1 |
Glucose Environment | C11 | C21 | C31 | C41 | C51 | C61 |
a* (Cellulose Iβ origin) | 105.8$ | 71.7 | 74.3 | 89.0 | 72.5 | 65.0 |
b | 105.2 | 72.6 | 75.5 | 89.1 | 72.6 | 65.2 |
c (Cellulose Iβ centre) | 104.1 | 71.8 | 75.2 | 88.1 | 71.2 | 65.8 |
d | 105.2 | 72.5 | 74.9 | 87.1 | 72.5 | 64.7 |
e | 105.0 | - | 74.7 | 89.8 | 71.1 | 65.3 |
Domain 2 |
Glucose Environment | C12 | C22 | C32 | C42 | C52 | C62 |
f | 105.2 | 72.4 | 74.4 | 84.5 | 75.3 | 62.4 |
g | 104.9 | 72.4 | 75.3 | 83.5 | 75.2 | 61.4 |
j | 105.1 | - | - | 83.3 | 73.9 | 61.3 |
k | - | - | - | 83.8 | 74.3 | 63.5 |
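To show how the tabulated shifts might be used in practice, the sketch below stores Table 1 as a lookup and returns the environments compatible with a set of observed shifts. The function name and the default tolerance of 0.2 ppm (twice the quoted ± 0.1 ppm error) are assumptions made for illustration only.

```python
# 13C shifts (ppm) transcribed from Table 1, ordered C1..C6.
# None marks a carbon that is not assigned in the table.
TABLE_1 = {
    "a": (105.8, 71.7, 74.3, 89.0, 72.5, 65.0),
    "b": (105.2, 72.6, 75.5, 89.1, 72.6, 65.2),
    "c": (104.1, 71.8, 75.2, 88.1, 71.2, 65.8),
    "d": (105.2, 72.5, 74.9, 87.1, 72.5, 64.7),
    "e": (105.0, None, 74.7, 89.8, 71.1, 65.3),
    "f": (105.2, 72.4, 74.4, 84.5, 75.3, 62.4),
    "g": (104.9, 72.4, 75.3, 83.5, 75.2, 61.4),
    "j": (105.1, None, None, 83.3, 73.9, 61.3),
    "k": (None, None, None, 83.8, 74.3, 63.5),
}
CARBONS = ("C1", "C2", "C3", "C4", "C5", "C6")


def match_environments(observed, tol=0.2):
    """Return the environments whose tabulated shifts agree with every
    observed shift to within tol ppm.

    observed maps carbon labels ("C1".."C6") to shifts in ppm. Carbons
    without a tabulated value for an environment are not tested.
    """
    matches = []
    for env, shifts in TABLE_1.items():
        compatible = True
        for carbon, value in observed.items():
            ref = shifts[CARBONS.index(carbon)]
            if ref is not None and abs(ref - value) > tol:
                compatible = False
                break
        if compatible:
            matches.append(env)
    return matches


print(match_environments({"C4": 88.1, "C6": 65.8}))
print(match_environments({"C4": 83.4, "C6": 61.4}))
```

With this tolerance the first query is compatible only with environment c, whereas the second is compatible with both g and j, reflecting how close the C4 and C6 shifts of these two domain 2 environments are.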
Glucose environments in cellulose of cell walls of different plants
Having identified all the glucose environments in cellulose of poplar wood, we were interested to compare these with cellulose of other plants. Over the years we have used high resolution 2D NMR to study a wide range of different plants including monocots, eudicots, and gymnosperms.44,49,50 The cellulose of many of the plants investigated has remarkably similar NMR shifts, such that it is possible to identify our assigned glucose environments in a wide range of plants. This is illustrated using a comparison of two different regions of the 30 ms CP PDSD spectra of a eudicot (poplar), a monocot (Brachypodium) and a gymnosperm (spruce) in Fig. 3 for the C1-C4 region and SI Fig. 2 for the C4-C6 region. The relative amount of each glucose environment varies between samples. For example, spruce wood appears to have relatively more b than a or c, whereas in poplar wood cellulose these are more similar in quantity. Brachypodium has significantly more of site d, which is a minor cellulose site in the poplar wood secondary cell walls. Whilst not visible in Fig. 3 and SI Fig. 2, the cellulose environment e is a very minor environment in all three plants. There is one additional cellulose environment, ‘s’, seen in the spectra of spruce. There are some small differences in the shifts for some of the more minor cellulose environments. For example, we found that environment d is generally broader than sites a, b, and c, with its C4 NMR shift varying by ~ 0.3 ppm, indicating that there is more variation in the local environment of d. Generally, the NMR shifts remain similar between all the plants, indicating that the glucose environments are constant in the cellulose fibrils.
Xylanase-treated holo-cellulose nanofibrils (hCNFs) maintain the native plant cellulose structure of poplar wood
Whilst studying the native plant cellulose in situ is ideal for ensuring minimal disruption of the cell wall, the NMR spectrum can be crowded since signals from both hemicelluloses and lignin are present, which tends to limit the resolution. The glucose environments in the fibrils may also be influenced by interactions of surface residues with hemicellulose or lignin. By removing the lignin during preparation of holo-cellulose nanofibrils (hCNFs) we can maintain the cellulose fibril structure whilst improving the resolution of the NMR spectrum.48 To remove hemicellulose we treated hCNFs produced from the poplar wood with xylanase. TEM images of these hCNFs show long and thin cellulose fibrils which are loosely bundled, as observed with the 3 nm width hCNFs prepared similarly from Arabidopsis (see SI Fig. 3).48 To investigate whether the xylanased hCNFs have maintained the cellulose environments, we compared the NMR spectra of poplar wood with those of the xylanased hCNFs. Their 1D CP MAS spectra are shown in SI Fig. 4. There is a significant change in the total signal in the C4 region and in the ratio of domain 1 and domain 2 cellulose peaks. This change is due to the removal of both lignin and hemicelluloses, as well as perhaps the loss of a less ordered cellulose component.
The 2D 30 ms CP PDSD comparisons of the C1-C4 region and the C1-C6 region shown in Fig. 4 and SI Fig. 5 respectively show that the domain 1 glucose environments remain almost unchanged by the production of the hCNFs from wood. The domain 2 environments show some slight NMR shift changes, typically < 0.3 ppm, in both the f and g/j environments. As the domain 2 environments are surface chains of the fibril this may be due to the removal of both lignin and the hemicellulose xylan. The domain 2 cellulose region of the 30 ms CP PDSD spectrum shown in SI Fig. 5 exhibits distinctly narrower peaks for the xylanased hCNFs, i.e. a broad contribution of somewhat less ordered environments has been removed. Since there were no other changes, we are convinced that the production of the hCNFs causes relatively minimal disturbance to the different glucose environments in the poplar fibrils.
Plant cellulose fibril core environments are identical to tunicate cellulose Iβ
The improvement in the resolution of NMR spectra makes analysis of the xylanased hCNFs ideal for determining NMR shifts, quantities, and the relative location of the different glucose sites within the fibrils. There have been many 1D 13C NMR studies of cellulose Iα and Iβ, but the use of different referencing has led to a spread in the published NMR shifts.19,20,51,52 Recently Brouwer et al. have resolved these discrepancies to give a consistent set of 13C NMR values.21 Using the same reference as Brouwer et al.21 we can compare our shifts from the high-resolution spectra of xylanased hCNFs from poplar wood with those of cellulose Iα and Iβ. None of the domain 2 glucose environments have NMR shifts close to those of glucose in these cellulose allomorphs (SI Table 1). This means we only need to consider the 13C NMR shifts of domain 1 glucose residues, where there are only three major sites: a, b and c. Figure 5 shows the C4-C3/C5 region of the CP refocussed INADEQUATE spectrum of the xylanased hCNFs with the Iα and Iβ shift positions marked. It is evident that the shifts of cellulose Iα are different from those of native plant cellulose, whereas the C3, C4 and C5 shifts of a and c are very close to those of Iβ. Indeed, all the shifts for sites a and c, apart from C1, are within ~ 0.2 ppm of those of cellulose Iβ (SI Table 1). The only substantial difference we observe from the Iβ shifts is in C1, where it seems likely that the Kono et al.19,20 assignment was incorrect, as highlighted in SI Fig. 6 and SI Table 1. This misassignment by Kono et al.20 presumably arose because a refocussed INADEQUATE spectrum was used for their assignment and both glucose environments in cellulose Iβ have nearly identical C2 NMR shifts, making it difficult to distinguish the associated C1 shifts. Using the 30 ms CP PDSD spectrum (SI Fig. 6) together with the INADEQUATE spectrum allowed us to determine the C1 shifts for environments a and c with confidence.
In a later paper Brouwer and Mikolajewski also commented on the possibility that the Kono et al. C1 assignments could be swapped.53 With this new NMR shift assignment of cellulose Iβ, all the shifts of fibril core glucose environments a and c closely match those of tunicate cellulose Iβ. We now assign environment a as the origin chain and environment c as the centre chain, because DFT calculations for cellulose Iβ predict that the C1 chemical shift of the residues in the origin chains is ~ 2 ppm higher than that of the residues in the centre chains, as seen here for environments a and c respectively.46 Native plant cellulose is clearly not a mixture of cellulose Iα and Iβ since, despite environment b having similar shifts to one of the Iα glucose environments for both C1 and C4 (SI Table 2), there is no sign of the second Iα glucose environment, which would also be visible in similar quantities within domain 1. This identification of glucose environments a and c now shows that the plant cellulose fibrils are cellulose Iβ.
Having determined that two of the main domain 1 environments correspond to glucose in the classical cellulose Iβ structure, we were interested to understand the origin of the third, remaining, domain 1 environment, b. Since b is from domain 1, it has been thought also to be interior to the fibril. To investigate this further, a 2D water-edited (w.e.) 30 ms CP PDSD spectrum was acquired to probe the glucose environments that are closest to water. Figure 6 shows the C1-C4 and C1-C6 regions of the w.e. spectrum compared with a standard 30 ms CP PDSD spectrum. As expected for surface glucose residues, the signal for the domain 2 sites f, g and j is enhanced, indicating that they are more water accessible than the fibril core sites a and c. SI Fig. 7 shows similar proximity to water of these three major domain 2 glucose environments. Site j may be further from water than f and g, indicating that these residues may be located on different surfaces of the fibril. Figure 6a shows that the w.e. C1-C4 peak for glucose environment b is also significantly enhanced in a similar way to f, g, and j, indicating that b is also closer to water than the core sites a or c. Glucose environment b of domain 1 is therefore, unexpectedly, in a surface chain of the fibril. Interestingly, for the C1-C6 region shown in Fig. 6b the signal from b is still enhanced in the w.e. experiment over the signal for the core a and c environments, but to a lesser extent than for the domain 2 surface glucose environments. Being in domain 1, environment b likely reflects a glucose residue in a tg conformation. This conformation may arise because the C6 faces toward the interior of the cellulose fibril, where the hydroxymethyl group hydrogen bonds with other residues.16 Hence in the glucose residue of environment b the C6 is further from water than its C4, and further from water than in the other surface environments, where the C6 has a water-facing gt or gg conformation.
Thus, in contrast to the widely accepted view, not all the domain 1 glucose residues are in the interior of the fibril. A consequence of this finding is that the ratio of domain 1 and domain 2 signals cannot be used to estimate the size of a fibril: when site b is included in domain 1, the amount of interior is overestimated and the amount of surface is underestimated. The ratio of interior to surface of the fibril can, however, be estimated from the C1 region of the INADEQUATE spectrum, where sites a and c are resolved from b, so that interior (a + c) and surface (D2 + b) can be determined by integration (see SI Fig. 8). The value of ~ 0.5 in the poplar hCNFs is consistent with an 18-chain fibril with 6 interior chains and 12 surface chains. The relative amount of site b is ~ 0.7 of sites a or c (see SI Fig. 8).
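The arithmetic behind this argument can be sketched as follows. The function names are illustrative, and the relative integrals used in the example are hypothetical values chosen only to be consistent with the two figures quoted in the text (an interior to surface ratio of ~ 0.5, and b at ~ 0.7 of a or c, taking a and c as equal); they are not measured data.

```python
def interior_to_surface(a, c, b, d2):
    """Interior (a + c) over surface (D2 + b), as described in the text."""
    return (a + c) / (d2 + b)


def naive_domain_ratio(a, c, b, d2):
    """Domain 1 over domain 2, wrongly treating site b as interior."""
    return (a + b + c) / d2


# Chain-count model named in the text: 6 interior and 12 surface chains.
model_ratio = 6 / 12

# Hypothetical relative integrals (assumption, not measured values):
# a = c = 1.0, b = 0.7 of a, and D2 chosen so that (a + c) / (D2 + b) = 0.5.
a, c, b, d2 = 1.0, 1.0, 0.7, 3.3

print(f"18-chain model, interior/surface: {model_ratio:.2f}")
print(f"(a + c) / (D2 + b):               {interior_to_surface(a, c, b, d2):.2f}")
print(f"domain 1 / domain 2 (b in D1):    {naive_domain_ratio(a, c, b, d2):.2f}")
```

Under these assumed integrals the domain 1 to domain 2 ratio comes out well above the interior to surface ratio, which illustrates why counting b as interior would overestimate the fibril core.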
To investigate the relative proximities of the different glucose environments, two CP PDSD experiments with longer mixing times (200 ms and 400 ms) were acquired. As these experiments probe distances up to ~ 5–8 Å, cross peaks between glucose residues in different sites within individual fibrils are additionally observed. Figure 7 shows that at a mixing time of 200 ms the C1-C1 region of the CP PDSD spectrum gives clear cross peaks between a and c only. This shows that the residue environments a and c are closer to each other than either is to residue environment b. This is consistent with our finding that b is not situated with a and c in the core of the cellulose fibril. The first clear cross peaks between residues in the two cellulose domains can be seen in the C4-C4 region of the 400 ms CP PDSD (Fig. 8a). There are cross peaks from domain 2 glucose environments to the single C41 peak corresponding to glucose residue environments a plus b. Given that we know that environments a and c are particularly close, and that there are no cross peaks to c, the cross peaks from domain 2 observed here are likely to be to environment b only. This proximity of b, and not c, to domain 2 glucose residues is also observed in the C6-C6 region of the 400 ms CP PDSD spectrum (Fig. 8b). There are also cross peaks between the different domain 2 environments of cellulose. The proximity of b to the domain 2 environments provides further confirmation that b is one of the four major surface glucose residue environments.