Accelerated molecular dynamics
Accelerated molecular dynamics (aMD) is an enhanced sampling technique that flattens the potential energy surface by introducing a non-negative boost potential, by which one can observe the long-time behaviors of simulation systems with limited computational time (Hamelberg et al. 2004; Hamelberg et al. 2007). Figure 1(a) shows the ab base planes of the two crystal models, which consist of 24 and 48 cellulose chains with DP = 40, denoted as the 40mer × 24 and 40mer × 48 models, respectively. Both crystal models share the (010) lattice planes of the same length, whereas the length of the (100) lattice plane of the former is doubled in the latter. The 40mer × 24 was solvated in 50,000 TIP3P water molecules (Jorgensen 1981; Jorgensen et al. 1983) containing 546 EDA molecules, and in 90,000 water molecules and 1,365 EDA molecules for the 40mer × 48 model.
The present aMD simulations adopted two acceleration levels, termed as a dual boost, which applied a boost potential to both dihedral and total potentials. The aMD parameters, the threshold boost energy, E, and the acceleration factor, a, are defined for each potential according to the Amber Tutorial (Pierce et al. 2012; Salomon-Ferrer et al. 2013b; Miao et al. 2014), following:

where Nresidues is the number of solute residues including EDA, Natoms is the total number of atoms, and Vdihedral_avrg and Vtotal_avrg are the average dihedral and total potential energies from conventional MD calculations, respectively.
As described in the next section, many EDA molecules are released from the crystal models in the early stage of the aMD calculations, which fills a solvent box with desorbed EDA molecules that should be removed from the estimation of the aMD parameters. The present aMD simulations were carried out by the multi-cycle procedure involving a solvent displacement step accompanied by re-estimation of the aMD parameters, following:
- Equilibration step. The simulation system, the cellulose I–EDA complex crystal model with water molecules, is equilibrated by the standard procedure described below. The final part of the procedure involves the conventional MD run for 1 ns for estimation of the aMD parameter.
- aMD calculations. The equilibrated system is subject to aMD calculations for 10 ns with the aMD parameters determined based on the conventional MD trajectory performed in the previous cycle, using the NPT ensemble at a constant temperature of 300 K or 370 K and pressure of 1 bar. The number of EDA molecules remaining in the crystal model is counted for each 1-ns aMD run.
- Displacement of the solvent box. A cellulose crystal accommodating the interior EDA molecules and water molecules, and the water molecules attached to the crystal surface are extracted from the final aMD structure. A new solvate box system is built for the crystal model extracted, where the number of water molecules is adjusted to be same as the original number. The new simulation system is then moved to the equilibration step of the next cycle.
Adaptive steered molecular dynamics
Steered molecular dynamics (SMD) uses a pseudo particle that applies a steering force to analyze the physical movement of atoms and molecules along a reaction coordinate at a particular velocity (Park et al. 2003; Park and Schulten 2004). SMD, in combination with the Jarzynski’s non-equilibrium work relation (Jarzynski 1997), predicts the potential of mean force (PMF), which is derived from the exponential average of numbers of non-equilibrium works. Adaptive steered molecular dynamics (ASMD) is an improved method of SMD for effective sampling, where the reaction coordinate is divided into smaller segments in which the work distribution becomes narrower (Ozer et al. 2010; Ozer et al. 2012). The 16 cello-octamers (8mer × 16 model) model was adopted in the ASMD shown in Fig. 1(b). From the crystal surface, four EDA molecules were removed from the central hydrophilic channel to produce a hole for an EDA migration path. The model was termed as the one-hole model and, in a similar fashion, the two- and three-hole models were composed by digging an additional hole on one side or both sides of the original hole, respectively. The one-hole model was additionally modified by expanding the space of the migration path, such that both crystal model halves (the 8mer × 8 units) were slid outward along the a-axis for a given distance. The model was termed as the gap model. All the crystal models were solvated in 11,000 to 15,748 TIP3P water molecules (Jorgensen 1981; Jorgensen et al. 1983) and the one-, two-, and three-hole models contained 45, 40, and 35 EDA molecules, respectively.
The EDA molecule at the bottom of the hole, which occupied the fifth EDA layer from the crystal surface, was assigned to be the target molecule to be pulled along the hydrophilic channel. The reaction coordinate was defined to connect the center of the target EDA with that of the opening of the hole at the crystal surface.
A harmonic potential with a spring constant of 71.5 kcal/(mol∙Å2) was imposed on the center of the target EDA, which traveled at the constant pulling velocity of 0.5 Å/ns for 4 Å, which corresponded to the spacing of the cellulose layers. PMF, or the free energy difference, was determined from the ensemble average of the work, W, done by 50 SMD runs by the following Jarzynski’s equality (Jarzynski 1997):

where β = 1/(kBT), kB is Boltzmann’s constant, and T is the temperature. At the switching point of the SMD stages, a single trajectory was selected according to that whose work value was the closest to the PMF of the current SMD stage, and the structure at the final frame of the selected trajectory was used as the starting structure of the next SMD stage. The four SMD stages covered the overall distance of the migration path of 16 Å in the hydrophilic channel; meanwhile, the target EDA traveled through the four cellulose layers until it reached the crystal surface. The SMD calculations were performed using the NVT ensemble at a constant temperature of 300 K. Throughout the ASMD calculations, the positions of all backbone atoms of a cellulose chain, C1, C2, C3, C4, C5, and O5, and of the two carbon atoms of the EDA molecules, except those of the target EDA, were weakly constrained with a force constant of 0.1 kcal/(mol∙Å2) to prevent the crystal model from irreversible distortion resulting from EDA movement.
Umbrella sampling method
In contrast to the SMD and ASMD methods with a non-equilibrium basis, an umbrella sampling (US) simulation is a classified equilibrium molecular dynamics method (Torrie and Valleau 1974; Torrie and Valleau 1977), although it is also used to calculate PMF or a free energy change along a given reaction path, where biased potentials are applied to an atom or molecule at certain closely spaced intervals along the path. In the present US simulations, the PMF trajectory obtained by the SMD calculations was adopted as a reaction coordinate of which the overall length, 16 Å, was divided into 0.01 Å intervals, which gave 1,600 umbrella windows. At each interval, a harmonic potential with a force constant of 100 kcal/(mol∙Å2) was applied to the target EDA molecule and the NPT production run was performed for 1 ns at a constant temperature of 300 K and pressure of 1 bar. As was performed in the present ASMD, all of the backbone atoms of cellulose and the carbon atoms of all EDA molecules, except the target EDA, were positionally constrained with a force constant of 0.1 kcal/(mol∙Å2). The biased EDA distributions obtained from the production runs were unbiased and combined using the weighted histogram analysis method (WHAM) to obtain a PMF profile (Kumar et al. 1992; Kumar et al. 1995; Roux 1995).
General MD procedures and parameters
All energy minimizations and MD calculations of the solvated cellulose I–EDA complex systems were carried out using the PMEMD and PMEMD.CUDA modules (Le Grand et al. 2013; Salomon-Ferrer et al. 2013a) of the Amber 18 program package (Case et al. 2018) with an NVIDIA Kepler GPU system. The equilibration procedure of the system was composed by two-step energy minimizations: optimization of water configurations and full optimization of the whole system, and the gradual heating of the system from 100 to 300 K for 500 ps using the NVT ensemble. The density of the system was then equilibrated by the NPT run at a constant temperature of 300 K and pressure of 1 bar for, typically, 1 ns. In the first energy minimization and the heating runs, the positions of the solute molecules, cellulose chains, and EDA molecules were constrained with a force constant of 50 kcal/(mol∙Å2). In the ASMD calculations, the density equilibrating NPT runs were extended to 50 ns until the holes in the hydrophilic channels had filled with waters with the positional constraint. The temperature and pressure of the system were regulated by a Langevin thermostat (Hoover et al. 1982; Evans 1983) with a collision frequency of 5 ps−1 and a Berendsen barostat (Berendsen et al. 1984), respectively. A time step of 2.0 fs, enabled by the SHAKE option (Ryckaert et al. 1977), was employed to ensure rigid hydrogen atoms. The particle mesh Ewald (PME) method (Essmann et al. 1995) was adopted for long-range interactions and the van der Waals interactions were cut off at 10 Å. The three extended ensemble MD options for aMD, SMD, and US, implemented by the AMBER package, were used.
Cellulose molecules were described by the GLYCAM06 force field (Kirschner et al. 2008) and EDA molecules were modeled using the revised GAFF parameters (Wang et al. 2004) developed by us (Uto et al. 2019).
The MD trajectories were analyzed using the CPPTRAJ module (Roe and Cheatham 2013) of the AmberTools 19 and, for a visual analysis, VMD 1.9.3 (Humphrey et al. 1996). US–PMF profiles were calculated using the WHAM code ver. 2.0.10 written by Grossfield Lab. (Grossfield 2018). The molecular graphics software, PyMOL ver. 2.5.0 (Schrödinger LLC. 2021), was also used for visualizing the system.