Molecular and thermodynamic insights into interfacial interactions between collagen and cellulose investigated by molecular dynamics simulation and umbrella sampling

Cellulose/collagen composites have been widely used in biomedicine and tissue engineering. Interfacial interactions are crucial in determining the final properties of cellulose/collagen composite. Molecular dynamics simulations were carried out to gain insights into the interactions between cellulose and collagen. It has been found that the structure of collagen remained intact during adsorption. The results derived from umbrella sampling showed that (110) and (11¯0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$1\bar{1}0$$\end{document}) faces exhibited the strongest affinity with collagen (100) face came the second and (010) the last, which could be attributed to the surface roughness and hydrogen-bonding linkers involved water molecules. Cellulose planes with flat surfaces and the capability to form hydrogen-bonding linkers produce stronger affinity with collagen. The occupancy of hydrogen bonds formed between cellulose and collagen was low and not significantly contributive to the binding affinity. These findings provided insights into the interactions between cellulose and collagen at the molecular level, which may guide the design and fabrication of cellulose/collagen composites.


Introduction
Cellulose is a non-branched natural polymer composed of repeating glucose units (C 6 H 10 O 5 ) [1] and is considered as the most abundant natural polymer in the biosphere.
Huaiqin Ma and Qingwen Shi have contributed equally to this work.
* Zhijian Li zjli@sust.edu.cn 1 3 Cellulose naturally has the advantages including low density, low price, as well as biodegradability. Functionality, flexibility, and high specific strength of cellulose have further been developed by exploiting its hierarchical structure that covers from nanoscale to macroscale in various forms including nanocrystalline, aggregates, fibrils and so on [2,3]. The application of cellulose as renewable and biodegradable raw material in various fields is a proposed solution to the recent industrial challenge to successfully meet environmental and recycling criteria. Traditionally, cellulosic materials have been used in industries for developing paper and textile. While in the last decades, nanostructures obtained by disintegration of cellulose fibers yielding nano-or microfibrillated cellulose and cellulose whiskers laid the foundation for the development of novel materials with extraordinary properties. Nowadays, cellulosic materials have been used for a variety of applications, such as wound treatment [4], tissue engineering [5], drug delivery [6], energy storage systems [7], sensors [8] and biosensors [9]. Collagen is one of the most abundant and widely distributed proteins in the human body [10]. collagens consist of three left-handed polyproline II-like helices, which coil about each other with a one-residue stagger to form a right-handed triple helix (Fig. 1). It is the main component of tendons, bone tissue, skin, ligaments, corneas and many interstitial connective tissues [11]. Collagen can also be a candidate for biomaterials such as tissue-engineered scaffolds and wound dressings [10,12,13]. However, the application of pure collagen materials is limited due to their low water resistance, fast biodegradation perishability, and poor thermal stability [13], while cellulose/collagen composite materials overcome the weaknesses of pure collagen materials. Researchers [14] have proved that cellulose/collagen composite has water resistance, strength and stability better than pure collagen. Animal experimental studies of cellulose/collagen nanofiber hydrogel scaffold showed that the composite scaffold had good porous structure and physical stability and could be used in bone tissue engineering [5]. It also had good biocompatibility and was conducive to cell adhesion and growth [15], demonstrating that collagen and nanocellulose composite is a promising material for wound dressings and tissue engineering scaffolds. The incorporation of cellulose is a convenient and promising way to reinforce collagen without impairing biocompatibility and biodegradability, which show great promise as cost-effective forward-looking materials for biomedical applications.
As potential biomedical material, the practical application values of cellulose/collagen composites are largely dictated by their physicochemical properties. For example, scaffolds must have sufficient mechanical strength such as tensile strength, yield strength and elastic modulus to maintain their structural integrity in the process of during transport, surgical handling and implantation. Engineered scaffolds should typically mimic the mechanical properties of their target tissue and therefore controllable regulation of mechanical properties is essential when confronting various tissues with distinctive mechanical features such as bone, cartilage scaffolds. Furthermore, density, melting point, and water absorption rate of composite also affect the application of biomaterials. These properties of composites are closely relevant to interfacial interactions [16,17], which largely determined the macroscopic physical properties of materials. Therefore, the elucidation of collagen and cellulose interactions is crucially important to instruct the design and optimization of composite materials. In addition to interfacial interactions, the effects of cellulose on the structure of collagen also impact the properties of composite materials. Therefore, it is necessary to study the interactions between cellulose and collagen to further improve the performance of the composite material.
The interaction between cellulose and protein complex is very complex, including hydration, cross-linking, covalent and non-covalent bonds between components [18]. The majority of studies investigated the interactions between cellulose and protein and focused on cellulose binding modules (CBMs). Several studies have established that three aromatic residues on a CBM surface were critical for binding onto cellulose crystals and that tryptophans contributed more to binding affinity than tyrosines [19,20]. CBMs have welldefined binding sites and tend to bind on hydrophobic (110) plane of Iα crystal [21]. The binding forces between different cellulosic nanomaterials and cellulose binding modules family belonging to cellulase have been investigated by single molecule force spectroscopy. It has been found that CBM1 had a similar interaction force on the surfaces of unmodified cellulose regardless of the degree of crystallinity and morphology [22]. Force spectroscopy measurements and molecular dynamics simulations indicated that CBM1 displayed lower binding affinity toward cellulose III [23]. However, only a few studies focused on the interactions between cellulose and collagen. Two-dimensional Fourier transform infrared spectroscopy analysis confirmed the hydrogen bonding and electrostatic interactions between succinylated collagen and carboxymethyl cellulose [24]. Although there have been experimental studies on the interaction between collagen and cellulose, the specific interaction and structural characteristics of collagen in the matrix has not been clearly stated [25]. The molecular-level interaction between cellulose and collagen is still obscure, which may limit the rational development of new cellulose/collagen composites with ideal physical and chemical properties.
Native crystalline cellulose contains a mixture of crystal faces [26], which complicates the direct assignment of cellulose-collagen interactions to a specific face by experimental methods such as NMR spectroscopy. Molecular dynamics simulations (MD) provide a good solution to bypass this limitation and a single crystal face in nanoscale can be constructed by molecular modeling. Molecular dynamics simulations have been carried out by Crowley et al. to study the interactions between cellulose and lignin and gained insights into quantitative relationships between different cellulose faces and specific lignin chemistries [27]. Molecular dynamics simulations have also been used to probe the interactions between water molecules and cellulose, which shed light on the wetting mechanisms of cellulose [28]. Herein, molecular dynamics simulations were carried out in this study to investigate the interactions between collagen and four crystal planes of cellulose.
In this work, Ιβ-cellulose, which is one of the main components of higher plants [29], was selected to model cellulose. The adsorption of collagen on different crystal planes of cellulose was simulated by molecular dynamics. Considering the different molecular structures and hydrophilicity of the exposed surfaces of cellulose, non-polar (100), polar ( 110 ), (110) and (010) with distinct roughness have been studied. The binding energy are evaluated by umbrella sampling and the key factors driving the combination of cellulose and collagen are discussed, which provide molecular insights into the interfacial interactions and gain insights into the possible strategies to improve the performance of cellulose/collagen composites.

Methods
The dominant natural crystalline cellulose polymorph exhibits four potential crystalline faces ((100), (110), ( 110 ) and (010)) with significant surface area (Fig. 1A). The four crystal facets of cellulose have been widely used in theoretical studies to investigate the interfacial interactions between cellulose and other molecules [27,28,30,31]. It is feasible to employ the four facets to probe the molecular interactions between cellulose and collagen. The crystalline cellulose faces were built based on an initial cellulose Iβ crystal constructed by Cellulose-Builder [32]. The number of repeating units in each face was chosen such that the size of cellulose could accommodate the collagen. The thickness of cellulose sheets ranged from 2.0 to 4.0 nm in order to keep the core structure stable in the process of simulation.
The initial structure of collagen was obtained by extracting three chains from the crystal structure (PDB code 1K6F) [33] as shown in Fig. 1B, which is the most investigated model in structural studies of collagen. The length of the collagen model used in the system is about 85 nm and a triple helix. In this study, we chose the (Pro-Pro-Gly) 10 as collagen model, which is the most and first widely investigated. Furthermore, the structure has been frequently used as theoretical model to study the structural characteristics of type II collagen [34][35][36]. Collagen was placed above the surface of the cellulose crystal in different orientation with the angles between collagen main axis and cellulose plane being 0º, 30º and 45º respectively (supporting information Fig. S1). VMD [37] was used to build all the models and 12 different systems were produced with the minimum distances between collagen and cellulose face ranging from 0.8 to 1.0 nm ( Fig. 2A). The distinctive cellulose-collagen orientations and distances eliminated the bias of initial structures and excludes the effects of orientations on the collagencellulose interactions. The composite system was solvated in a cubic box with a TIP3P water model [38,39] and modeled by a CHARMM36 force field [39,40].
All MD simulations were performed in GROMACS-5.1 package [41,42]. All the systems were equilibrated carefully in the beginning of simulation. The energy minimization process was carried out with 1000 cycles of steepest descent and 1000 cycles of conjugate gradient minimization. Then, equilibration runs were performed for 5 ns in the NVT ensemble and 5 ns in the NPT ensemble with protein backbone fixed. Finally, 500 ns production runs were simulated in the NPT ensemble with protein released. The longrange electrostatic interactions were treated by the particle mesh Ewald (PME) method [43], while the short-range van der Waals interactions were calculated with a cutoff distance of 1.0 nm. All covalent bonds containing hydrogen atoms were constrained by the LINCS algorithm [44]. The V-rescale thermostatic [45] was used to heat the system to 300 K and the Parrinello-Rahman Pressure coupling [46,47] kept the system pressure at 1 bar. The integration step size of the simulation process is 2 fs. Cellulose, which exists as high polymer in nature, is nearly infinitely long in the direction of glucosidic bonds. Therefore, we set up PBC where the glucose in one end is covalently linked to the glucose in the other end by forming β1-4 glucosidic bond through a periodic box, which is consistent with the actual situation. Therefore, periodic boundary conditions were applied in all directions with covalent glycosidic bonds formed between mirror images, which was generated by Cellulose-Builder automatically [32].
The potential of mean force [48] (PMF) obtained by pulling simulation and umbrella sampling [49] was used to calculate the binding free energy of the system. The cellulose surface was used as a reference point and a harmonic potential was applied to the collagen as a pulling point. The last frame of the MD simulations was selected as the initial conformation, 300 ps umbrella traction was provided for collagen along the z-axis to increase the center of mass (COM) distance between collagen and cellulose. The spring constant used was 2000 kJ mol −1 nm −2 and the pull rate was 0.01 nm/ps. More than 13 umbrella sampling windows were selected according to the interval size of COM values. 1 ns pressure equilibrium was performed on each sample, then 10 ns of dynamic simulation process was carried out. Finally, weighted histogram analysis [49] (WHAM) was used to calculate PMF. The thicknesses of the four crystal planes are different, then the reaction coordinates of PMF were adjusted by deducting a half of thickness value accordingly. The reaction coordinates of PMF were the distance between the surfaces of crystal planes and collagen, which facilitated the investigation of interfacial interactions between cellulose crystal planes and collagen.
The relevant modules in GROMACS were used to calculate the backbone root mean square deviation (RMSD), backbone root mean square fluctuation (RMSF) and radius of gyration (Rg) of proteins during the whole simulation process. When the distance between two atoms is less than 0.5 nm, the two atoms were in contact. The number of hydrogen bonds was calculated with the Donor-Acceptor distance 0.30 nm and Hydrogen-Donor-Acceptor 30° as criterial. VMD was employed to render all the structures.

Structural stability of collagen on the four faces of cellulose
The (100) face is hydrophobic and the surface is unable to form significant hydrogen bonds with other molecules due to the orientation of cellulose hydroxyl groups. The (110), ( 110 ) faces and the (010) face are hydrophilic with abundant hydrogen bond interaction sites [50]. Though the hydrophobicity of each face was distinctive, the collagens were all adsorbed on the surfaces of cellulose. Figure 2B-E exhibit the last snapshots of MD simulations, in which all The RMSD of collagen backbone was calculated to quantitatively measure the change of collagen structure during the simulation. RMSD values of collagen on different cellulose crystal faces are kept between 0.2 and 0.3 nm as shown in Fig. 3A-D. The profiles of RMSD exhibit very small fluctuation during the whole process. RMSD values in this range indicate that the protein structures remain stable during the adsorption process. Furthermore, RMSF was calculated to evaluate the free movement degree of each residue in collagen molecules. As shown in Fig. 3E-H, the profiles of RMSF have good accordance with each other regardless of the properties of different cellulose faces. Each chain of the collagen model is composed of 29 residues and the peak of the line represents the end of each chain, which indicates that the two ends of the polypeptide chain are more flexible and the structure of other residues located in the middle of collagen is stable. The overall structures of collagen are not damaged during the process of adsorption on different crystal faces of cellulose.
To analyze the structural changes of collagen in the process of adsorption in more detail, the Ramachandran plot [51] was employed to characterize the changes in the secondary structure of proteins. Collagen is a coil, but one with distinct tertiary and quaternary structures: three separate polyproline II polypeptides, called α chains are supertwisted about each other. The superhelical twisting is right-handed in collagen, opposite in sense to the left-handed helix of the α chains. Thus, it is ambiguous to evaluate the secondary structure of collagen by designating the structure helix, sheet, or coil. Therefore, Ramachandran plots were employed. The collagen helix is a unique secondary structure with Phi = -51°and Psi = + 153°, which is quite distinct from the α helix [52]. As shown in Fig. 4, Ramachandran plots of the last frames extracted from the three MD simulations display similar characteristics and most of Phi and Psi values are confined within the region corresponding to the structure of collagen. It is found by comparison that the Phi and Psi angles of collagen not significantly deviated ( Fig. 4A-D) from the specified collagen conformation values during the adsorption of collagen in different crystal faces of cellulose. Therefore, it is indicated that all models exhibit partial structural interruption but are not damaged during adsorption.
To further investigate the effects of cellulose on the global structure of collagen, free energy contour maps were constructed with RMSD and Rg as reaction coordinates. As shown in Fig. 5, the free energy contour values of each model are located in a similar region with only one global minimum, which indicates that the effects of different crystal face on collagen structures are too weak to induce obvious changes. All the global minima of the 12 MD simulations are restricted within narrow ranges with RMSD about 0.23 nm and Rg about 0.42 nm, which further validates the intactness of collagen on the surface of cellulose. The conformational space of collagen on the surface of cellulose

Interactions between collagen and celluloses during adsorption
To understand the adsorption of collagen onto cellulose, umbrella sampling simulations were performed for all the four faces. Prior to the umbrella sampling simulations, the centers of all the cellulose models move to original points. The profiles of umbrella windows have been provided and exhibited in supporting information Fig. S2. As shown in Fig. 6, the lowest PMF values are found near the layer surface for all the systems, indicating that collagen tends to bind to cellulose. PMF shows that free energies are close for faces (110) and ( 110 ) with values of − 14.9 kcal/mol and − 14.7 kcal/mol respectively, which imply almost the same adsorption strength of collagen onto the (110) and ( 110 ) surface. Free energy of (100) (− 9.5 kcal/mol) is a little higher than those of (110) and ( 110 ) faces, indicating that the affinity between (100) layer and collagen is weaker. In contrast with PMF profiles of (110), ( 110 ), and (100), the binding energy of (010) is the largest (− 4.4 kcal/mol).
These free energies derived from umbrella samplings indicate that collagen tends to migrate toward all the faces of cellulose. In particular, (110) and ( 110 ) faces displaying the strongest affinity to collagen.
To characterize the binding process of collagen on cellulose, we calculated the number of collagen heavy atoms that are within 0.5 nm to the four crystal faces over time. As depicted in Fig. 7A, the contact profiles of the three (100) replicas show commendable consistency with no significant changes after 200 ns. Similarly, two stages of contact numbers are also observed (Fig. 7G) when collagen binds to (010) face: an initial stage with contact numbers ranging from 0 to about 180 (0-100 ns) and a stable final stage with contact number fluctuating around 180. As for (110) and ( 110 ) face, the profile of contact numbers varies but all the collagens form stable binding with cellulose after 250 ns. In general, collagen adsorbed onto cellulose spontaneously and form stable combination with cellulose. Due to the structural anisotropy of cellulose, the four faces exhibit different values of contact numbers in the last stages.
In order to further clarify the molecular mechanism underlying the difference of binding energy, the distributions of contact numbers have been calculated. As shown  (110) and ( 110 ) faces are similar but move to the right in contrast with that of (010) face, which indicate more intimate contact with cellulose. (100) face exhibit the largest contact number among the four faces. The origin of this this phenomenon can be attributed to the topography of cellulose crystal faces. The (010) surface is saw-toothed (Fig. 8D), while the (100) surface is almost flat (Fig. 8A), and the (010) surface is expected to have a roughness much higher than the other three surfaces. (110) and ( 110 ) faces exhibit nearly identical roughness (Fig. 8B, C), which is consistent with the similar distributions of contact numbers. Though previous studies showed that the (010) plane had the highest contact surface area [53,54], the volume of collagen is much larger than these of polylactic acid and oleic acid, which hinders the binding of collagen onto (010) face with a half of the hydroxyl grouping shielded. Furthermore, the side chains of Pro and Gly are rigid and fail to penetrate into the groves in (010) face (Fig. 8D).
There are highly prominent hydroxyl groups at the (010), (110) and ( 110 ) surfaces, and the three hydroxyl groups on the pyranose ring are located at the equatorial position of the ring. Therefore, (010), (110) and ( 110 ) have significant hydrophilic properties and polarity. The (100) surface  corresponds to the axial direction of the pyranose ring with hydrophobic C-H groups exposed to the surrounding medium and exhibits non-polarity. In the view of the large amount of hydroxyl groups and the polarity of Gly in collagen, electrostatic interactions may play an important role at the interface, particularly through hydrogen bonding. The number of hydrogen bonds between collagen and cellulose have been calculated. It has been found that even though there are a large number of hydroxyl groups in the surface of cellulose, few hydrogen bonds formed between cellulose and collagen are observed (Fig. 9). In terms of (010) face, one hydrogen bond occupancy is lower than 12% while two hydrogen bond occupancy is lower than 0.5%. (110) and ( 110 ) faces display similar tendency to form hydrogen bonds with collagen and form less than 8% and 6% one hydrogen bond occupancy, respectively. (100) face display the least hydrogen bond among all the faces with occupancy less than 3%. Hydrogen bonding interactions between cellulose and collagen are limited due to steric hindrance. The sidechains of proline contact with the surface of cellulose, which hinders the interactions between backbone of collagen and cellulose.
The results of hydrogen bonds interactions are different from the previous researches, which suggest that there are lots of hydrogen bonds so that they enhance the composite of cellulose and collagen [55,56]. The different criterial to determine the formation of hydrogen bonds may cause the discrepancy of hydrogen bonding interactions. Focusing particularly on the behavior of the high frequency N-H and O-H stretching vibrations, Rudisill et al. [55] examined hydrogen bonding interactions between collagen fibrils and cellulose by using FTIR spectroscopy. Zhang et al. [56] also confirmed the existence of hydrogen bonding interactions by FTIR spectra which further validated by differential scanning calorimetry (DSC) and dynamic mechanical analysis (DMA). In our study, hydrogen bonds were considered to form when the Donor-Acceptor distance was 0.30 nm and Hydrogen-Donor-Acceptor angle was 30°. Furthermore, the difference in hydrogen bonding interactions may attribute to the distinctive primary structures of collagen. The determining factors regulating hydrogen bonding interactions shed light on the selection of collagen and collagen with more hydroxyproline residues may perform better due to the fact that the abundance of hydroxy groups provide more sites to form hydrogen bonds.
It was interesting to note that the polar (110) and ( 110 ) faces formed hydrogen bonding linker with collagen participated by water molecules. As shown in Fig. 10C, water molecules between collagen and cellulose form hydrogen bonds with backbone carbonyl oxygen and hydroxyl group at the same time, connecting (110) face and collagen. The similar phenomenon has also been observed in the interface between ( 110 ) face and collagen with 3 water molecules involved in the network of hydrogen bonds. However, there is no hydrogen bonding linker observed in (100) and collagen. As shown in Fig. 10A, due to its hydrophobicity, few water molecules accumulate between collagen and (100) face, which prohibit the indirect connection through hydrogen bonding. It is intriguing to find that there is no hydrogen bond network connecting collagen and (010) face though the (010) face is hydrophilic. As shown in Fig. 10B, water molecules accumulate in the interface between (010) and collagen and most of water molecules are stuck in the deep groves. The steric confinement caused by the limited volumes of the groves restricts the freedom of water molecules and prohibits the formation of hydrogen bonds which require suitable distance and orientation.
MD simulation in this work was based on the idealized situation in which the adsorbed/interacting molecule interacts with perfect crystallographic planes defined by their Miller indices. In order to bridge the gap between ideal models and realistic situations, surface energy and Wulff shape were employed to further evaluate the contributions of the for crystallographic planes. As reported by Fularz et al. [57], the surface energies of (100), (010) and (110) crystal faces are 81 mJ/m 2 , 166 mJ/m 2 and 129 mJ/m 2 (Here surface free energies which considered the entropy changes and was closer to real situations were selected). Due to the similar structural features of (110) and ( 110 ), we used the values of (110) to represent the free energy of ( 110 ). Based on Gibbs Wulff theory that the length of a vector drawn normal to a crystal face h j will be proportional to its surface energy γ j [58][59][60]. The linear growth rate of a crystal face is proportional to the specific surface free energy. For nanoparticles, the crystal surface with low surface energy grows faster and will become the main exposed crystal surface of nanoparticles. Thus, the contributions of the different crystallographic planes can be weighted by surface energies of the four crystal facets. The weighted values of (100), (110), ( 110 ) and (010) are 0.36, 0.23, 0.23 and 0.18 respectively according to Gibbs Wulff theory. When the effects of surface energy on the occurrence of such planes in cellulose crystals were considered, the binding energies between collagen and (100), (110), ( 110 ) and (010) faces are − 3.42 kcal/ mol, − 3.43 kcal/mol, − 3.38 kcal/mol and − 0.80 kcal/mol respectively, which indicated that the contributions of (100), (110) and ( 110 ) crystallographic planes are similar in collagen-cellulose binding.
In realistic situation, the contributions of (100), (110) and ( 110 ) crystallographic planes are similar due to the different areas of exposed crystal faces. However, the polar (110) and ( 110 ) surfaces had the strongest affinity with collagen. Although there was a minimum number of hydroxyl groups on the (100) surface, the interaction of the surface C-H moiety with collagen can achieve a larger van der Waals interactions with the smoothest surface. The (010) face exhibits the weakest interactions with collagen. The affinity can be attributed to the cellulose surface structure and the corresponding contribution of indirect hydrogen bonding interactions. Direct hydrogen bonding interaction is not significantly contributive to the affinity. Selecting proper collagen with more hydroxyproline may promote hydrogen bonding interactions. Furthermore, surface modification to increase the polarity of cellulose may also be feasible to enhance the interactions between cellulose and collagen, which has been validated by previous study [61]. Adding crosslinking agent playing similar roles to water molecules is also a possible strategy to improve the property of cellulose/collagen composite.

Conclusion
The interfacial structure between cellulose and collagen is one of the key factors regulating the performance of composites. In order to better understand the composite material at the molecular level and control the performance of the composite material rationally, MD simulations were carried out to investigate the interactions between cellulose and collagen. It has been observed that collagen was adsorbed onto the surface of cellulose without structural damage. (110) and ( 110 ) crystal faces exhibited the strongest affinity with collagen compared to the other two faces, which was attributed to combined effects of smooth surface and hydrogen bonding network involved water molecules. Van der Waals interactions and crosslinking effects of hydrogen bonding network codetermined the affinity between cellulose and collagen. Direct hydrogen bonding interactions were not the predominant force driving the binding with low occupancy. This study can provide insights into the interface properties of cellulose and collagen at the molecular level, which may shed light on the rational design of cellulose/collagen composite and promote the applications of the biomaterial.