Bringing Chemical Structures to Life with Augmented Reality, Machine Learning and Quantum Chemistry

Visualizing 3D molecular structures is crucial to understanding and predicting their chemical behavior. However, static 2D hand-drawn skeletal structures remain the preferred method of chemical communication. Here, we combine cutting-edge technologies in augmented reality (AR), machine learning, and computational chemistry to develop MolAR, a mobile application for visualizing molecules in AR directly from their hand-drawn chemical structures. Users can also visualize any molecule or protein directly from its name or PDB ID, and compute chemical properties in real time via quantum chemistry cloud computing. MolAR provides an easily accessible platform for the scientific community to visualize and interact with 3D molecular structures in an immersive and engaging way.


Main Text
In 1953, James Watson and Francis Crick proposed the double-helix model as the three-dimensional structure of DNA. They assembled a structure using a set of cardboard cutouts representing the different chemical components. 1 Although quite rudimentary, this visualization tool allowed them to observe how the complementary base pairs fit together to form the structure of a double helix.
This example showcases the importance of visualizing the 3D spatial arrangement of molecules in understanding their chemical behavior. Since Watson and Crick's discovery, 3D visualization tools have made giant strides driven by technological advances. Soon, chemistry modeling kits were widespread in schools and labs; this allowed chemists to build molecules with spheres and sticks, see the bond angles and molecular shape, and feel which bonds can bend or twist. 2 Nevertheless, building a molecule from scratch during a short lecture can be impractical, while building a biomolecule requires specialized skills.
The emergence of graphical modeling software allowed scientists to image and interact with the 3D arrangement of atoms within molecules. 3 Visualization of dynamic trajectories offered a further advance, allowing the evolution of chemical mechanisms to be observed at an atomistic level.
The growing development of extended reality technologies promises a new era for highly immersive molecular visualization. 4 Augmented reality (AR), which superimposes computer-generated content onto real-world scenes, has become popular in a wide range of applications. It is now supported by most smartphones and tablets and will soon be supported natively in web browsers. 5 Science can benefit from the realistic and immersive nature of this tool to visualize 3D molecular structures and their chemical properties in the real world.
Recently, a number of applications have been developed for viewing molecules in AR, primarily for chemical education, and a comprehensive overview of the latest applications has been provided. 6 Currently, most of these applications require the use of fiducial markers, such as QR codes or specific patterns printed on a card, in order to place the 3D model and track its location, [7][8][9][10][11][12][13][14][15][16] or the use of specialized hardware, such as AR glasses or head-mounted displays. [17][18][19][20] Coster developed an app that can recognize a limited number of chemical structures on which it had been previously trained. 21 Others require the 3D model files of the molecules to be created and pre-loaded into the app prior to the recognition, thus limiting the generalizability of the tools. 22
To overcome these limitations, we combine cutting-edge technologies in machine learning, augmented reality, and computational chemistry to develop MolAR, a mobile application which allows molecules to be directly visualized in AR together with their electronic features from their hand-drawn structure. The app does not require the use of markers, printouts, or specialized hardware, and there is no need to prepare 3D model files or pre-register molecules. Molecules can be input from their common/IUPAC name, SMILES string, PDB ID or directly from a picture of their chemical structure. Indeed, skeletal chemical structures continue to be the primary language for chemical communication due to their simple and intuitive nature. Once a molecule is displayed, its chemical properties such as frontier molecular orbitals and dipole moment can be calculated and visualized in real time. The combination of these technologies allows molecules to come to life starting from their chemical structure, paving the way for the modernization of scientific education.

Results And Discussion
MolAR is an iOS application that can be downloaded for free from Apple's App Store. 23 Figure 1 outlines the main functionality of the app, showing the structure and object recognition, connection to public databases for protein and molecule visualization, and calculation of quantum mechanical properties. The sections below outline each of the app's functionalities and their significance in research and educational settings.
Building upon the field of optical chemical structure recognition, we recently developed ChemPix, a software package to recognize hand-drawn hydrocarbon structures using deep learning. 24 The algorithm handles structures with wobbly lines, gaps, uneven bond angles, background noise, and shadows. More recently, Mathpix released a hand-drawn chemical structure recognition tool which can digitize molecules with heteroatoms. 25 MolAR employs Mathpix, allowing users to photograph a chemical structure on a piece of paper or a whiteboard and bring it to life in AR. The 3D molecule appears above the structure, and users can perform the pinch gesture on the phone to scale it up or down or use touch and drag to rotate and translate the model. This allows users, for example, to take a picture of a 2D chemical structure in the textbook they are reading and visualize it as a 3D model in a matter of seconds. Chemical structures can also be drawn directly in the app for instances when pen and paper are not handy (Figure S1).
Users also have the option to "hunt" for molecules in common household objects. Each object is mapped to a characteristic molecule responsible for its flavor, color, or smell. For example, when users take a picture of coffee, the 3D model of caffeine appears above it. This can serve to gamify the app, allowing interested users to grasp that objects in the real world are composed of molecules which determine their properties. The feature is designed to develop their scientific curiosity by actively learning about chemicals in a fun and engaging way.
Furthermore, users can view in AR any molecule on PubChem by typing its name or SMILES and any of the 180,000+ proteins and biomolecules in the Protein Data Bank by entering its PDB ID. The immersive interaction with a 3D life-size protein offers a unique way to inspect the structural elements, the active site, and the solvent accessible channels, helping to unveil the complex structure-function relationships occurring in large biomolecules. In addition to research applications, making visualization of molecule and protein structures accessible to the general public could aid in effectively communicating scientific viewpoints. To help users who are unfamiliar with chemical structures learn about molecules, the app also has a gallery section where users can browse molecules and proteins of interest and visualize them in AR (Figure S1).
In addition to structural visualization, MolAR allows users to compute and visualize a selection of electronic properties arising from a particular molecular arrangement. These can be computed in real time through quantum mechanical calculations on TeraChem Cloud, a cloud-based, GPU-accelerated electronic structure package. 26 This allows users to visualize the frontier orbitals and the dipole moment vector for the chosen molecule in a matter of seconds. The ability to readily visualize molecular orbitals in AR directly from a hand-drawn structure is a powerful learning tool, particularly when teaching molecular orbital theory. It is also useful in a research context when trying to understand the electronic properties of a molecule and how it will interact with light. MolAR also allows users to visualize vibrational normal modes for a selection of molecules of educational interest. Through this feature, users can learn about vibrational motion by watching atoms animating in a real-world scene.
Students can become familiar with the symmetry features of the vibrational modes, for example, distinguishing between symmetric and asymmetric stretching or understanding motions that induce a change in dipole in the context of spectroscopy. Most importantly, it helps students to understand that molecules are not static but dynamic, three-dimensional objects.
The MolAR workflow is summarized in Figure 2. When the user takes a picture of a chemical structure, the app starts tracking the position of the structure so that it can place the model there later. The picture is sent to a server, which feeds the image to Mathpix 25 to predict the SMILES representation of the structure. The server then sends the SMILES to the National Cancer Institute's Online SMILES Translator to obtain an SDF file with the molecular structure. 27 The server returns the SDF to the app. Finally, the app converts the SDF to a USDZ file, a format for AR models on iOS devices, and places the molecular structure above the hand-drawn image. If the user goes on to request computational chemistry properties, the SDF is sent to TeraChem Cloud 26 which returns the requested properties. An equivalent workflow is used for object recognition. Further technical details are provided in the Methods section and Supplementary Information.
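The order of the server-side steps can be illustrated with a minimal Python sketch. This is not the MolAR server code (which is written in Node.js); the function run_workflow and its three callable parameters are hypothetical stand-ins for the calls to Mathpix, the Online SMILES Translator, and TeraChem Cloud, whose actual interfaces are not reproduced here.

```python
from typing import Any, Callable, Dict, Optional


def run_workflow(
    image: bytes,
    image_to_smiles: Callable[[bytes], str],
    smiles_to_sdf: Callable[[str], str],
    compute_properties: Optional[Callable[[str], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Chain the recognition steps in the order described in the text.

    All three callables are hypothetical placeholders for remote services.
    """
    # Step 1: structure recognition (image -> SMILES).
    smiles = image_to_smiles(image)
    # Step 2: 3D structure generation (SMILES -> SDF text).
    sdf = smiles_to_sdf(smiles)
    result: Dict[str, Any] = {"smiles": smiles, "sdf": sdf}
    # Step 3 (optional): quantum chemistry properties computed from the SDF.
    if compute_properties is not None:
        result["properties"] = compute_properties(sdf)
    return result


if __name__ == "__main__":
    # Stub services so the sketch runs without any network access.
    demo = run_workflow(
        b"fake-image-bytes",
        image_to_smiles=lambda img: "O",
        smiles_to_sdf=lambda smi: "SDF for " + smi,
    )
    print(demo)
```

Passing the services in as parameters keeps the sketch runnable offline and makes the optional property calculation explicit.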

Conclusions And Outlook
In conclusion, we developed MolAR, a mobile application that employs augmented reality, computational chemistry, and machine learning to transform images of hand-drawn chemical structures into 3D molecules in AR and compute their quantum mechanical properties. It does not require specialized hardware or even a desktop computer/laptop, making it easily accessible and convenient for users. MolAR serves as an immersive training application to help students master mental visualization of the 3D structures of molecules. In addition, it provides researchers with a platform for barrierless visualization of protein structures and analysis of molecular properties to aid in the understanding of chemical behavior.
The app serves as a building block which can be directly connected to a whole host of additional tools. In the future, we plan to incorporate dynamic motion, chemical reactions, and computation of more quantum mechanical properties, such as the excitation energy. We also plan to link MolAR to our recently reported ChemVox application that performs voice-activated quantum chemistry. 28 Overall, MolAR is the latest example of how technological progress can enhance scientific research and education, offering another valuable tool in the scientist's current arsenal.

Methods
The MolAR app is written using the Swift programming language and Apple's iOS SDK, a software development kit for iPhone and iPad. It uses ARKit, 29 part of the iOS SDK, to implement augmented reality. ARKit has features for AR such as device motion tracking and scene processing. The app communicates with a web server whose main functionality is to recognize objects or chemical structures in an image. The server is written in Node.js. We describe the method in detail below.
AR tracking. When the user takes a picture of an object or a chemical structure, the app places a virtual anchor at the target. The virtual anchor allows the app to track the location of the target as the user moves while the app is processing the picture. Once the app receives the 3D structure from the server, the app places it at the anchor.
Image to SMILES. The server feeds the image it receives to Mathpix 25 to predict the SMILES representation.
SMILES to SDF. We use the Online SMILES Translator by the National Cancer Institute to convert SMILES to SDF. The SDF data of a molecule contains the 3D coordinates of each atom in the molecule and the type of the bond between pairs of atoms.
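To make the content of an SDF record concrete, the following is a minimal Python sketch of a reader for the fixed-width V2000 connection table used in SDF files. It is an illustration, not the parser used in MolAR (which is written in Swift); the names Atom, Bond, and parse_sdf_v2000 are hypothetical, the sketch assumes a single V2000 record, and the water coordinates are approximate values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Atom:
    symbol: str
    x: float
    y: float
    z: float


@dataclass
class Bond:
    a: int      # 0-based index of the first atom
    b: int      # 0-based index of the second atom
    order: int  # 1 = single, 2 = double, 3 = triple, 4 = aromatic


def parse_sdf_v2000(text: str) -> Tuple[List[Atom], List[Bond]]:
    """Read atoms and bonds from the first V2000 record in an SDF string."""
    lines = text.splitlines()
    counts = lines[3]  # three header lines precede the counts line
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        # Coordinates occupy three 10-character fields; the symbol follows.
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        atoms.append(Atom(line[31:34].strip(), x, y, z))

    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        # Atom indices in the file are 1-based; convert to 0-based.
        bonds.append(Bond(int(line[0:3]) - 1, int(line[3:6]) - 1, int(line[6:9])))
    return atoms, bonds


# Illustrative water record with approximate coordinates in angstroms.
WATER_SDF = "\n".join([
    "water",
    "  illustrative example",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.1173 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000   -0.7572   -0.4692 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
    "$$$$",
])

if __name__ == "__main__":
    atoms, bonds = parse_sdf_v2000(WATER_SDF)
    print(atoms)
    print(bonds)
```

Slicing by column rather than splitting on whitespace follows the fixed-width layout of the format, in which adjacent numeric fields may touch.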
SDF to geometric primitives. In this step, atoms and bonds in the SDF data are converted to spheres and cylinders. Atoms are represented by spheres, colored according to the CPK coloring scheme and sized according to a scaled van der Waals radius. Bonds are represented by cylinders, colored according to the atoms they connect.
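A minimal Python sketch of this conversion is given below. It is not the MolAR implementation: the scale factor RADIUS_SCALE, the bond radius BOND_RADIUS, and the fallback values for unlisted elements are assumptions made for illustration, and splitting each bond into two half-cylinders, one per atom color, is one plausible reading of "colored according to the atoms they connect". Only a few elements are tabulated, with commonly used CPK colors and Bondi van der Waals radii.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
RGB = Tuple[float, float, float]

# Commonly used CPK colors (RGB in 0..1) for a few elements.
CPK_COLORS: Dict[str, RGB] = {
    "H": (1.0, 1.0, 1.0),  # white
    "C": (0.2, 0.2, 0.2),  # dark grey
    "N": (0.0, 0.0, 1.0),  # blue
    "O": (1.0, 0.0, 0.0),  # red
}
# Bondi van der Waals radii in angstroms.
VDW_RADII: Dict[str, float] = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

DEFAULT_COLOR: RGB = (1.0, 0.5, 0.8)  # assumed fallback for unlisted elements
DEFAULT_RADIUS = 1.8                  # assumed fallback radius
RADIUS_SCALE = 0.25                   # assumed scale for ball-and-stick spheres
BOND_RADIUS = 0.1                     # assumed cylinder radius


@dataclass
class Sphere:
    center: Vec3
    radius: float
    color: RGB


@dataclass
class Cylinder:
    start: Vec3
    end: Vec3
    radius: float
    color: RGB


def build_primitives(
    atoms: Sequence[Tuple[str, Vec3]],
    bonds: Sequence[Tuple[int, int]],
) -> Tuple[List[Sphere], List[Cylinder]]:
    """Convert (symbol, position) atoms and index-pair bonds to primitives."""
    spheres = [
        Sphere(pos,
               RADIUS_SCALE * VDW_RADII.get(sym, DEFAULT_RADIUS),
               CPK_COLORS.get(sym, DEFAULT_COLOR))
        for sym, pos in atoms
    ]
    cylinders: List[Cylinder] = []
    for i, j in bonds:
        (sym_i, p_i), (sym_j, p_j) = atoms[i], atoms[j]
        mid = ((p_i[0] + p_j[0]) / 2, (p_i[1] + p_j[1]) / 2, (p_i[2] + p_j[2]) / 2)
        # Each half of the bond takes the color of the atom it touches.
        cylinders.append(Cylinder(p_i, mid, BOND_RADIUS,
                                  CPK_COLORS.get(sym_i, DEFAULT_COLOR)))
        cylinders.append(Cylinder(mid, p_j, BOND_RADIUS,
                                  CPK_COLORS.get(sym_j, DEFAULT_COLOR)))
    return spheres, cylinders


if __name__ == "__main__":
    water = [("O", (0.0, 0.0, 0.1173)),
             ("H", (0.0, 0.7572, -0.4692)),
             ("H", (0.0, -0.7572, -0.4692))]
    spheres, cylinders = build_primitives(water, [(0, 1), (0, 2)])
    print(len(spheres), "spheres,", len(cylinders), "half-bond cylinders")
```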
Geometric primitives to USDZ. USDZ is a file format created by Pixar for interchange of 3D models. 30 iOS has a built-in viewer that can show USDZ files in AR. The app generates USDZ files from geometric primitives according to the file format specification. We choose to generate USDZ files on device rather than on the server because the USDZ file size is much larger than that of the geometric primitives.
Visualizing molecular vibrations. The USDZ file format supports keyframe animations. Each geometric object can be translated, rotated, or scaled by specifying the transformation matrices at different points in time. To visualize molecular vibrations, we animate atoms and bonds separately. The sphere for each atom is translated according to its trajectory. The cylinder for each bond is translated, rotated, and scaled to maintain the connection between the atoms.
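The keyframe geometry can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the MolAR code: atoms are assumed to oscillate sinusoidally along a given displacement vector, the function names atom_keyframes and bond_pose are hypothetical, and the conversion of each pose into a USDZ transformation matrix is omitted. The linear triatomic geometry and mode vectors in the example are illustrative values.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def atom_keyframes(
    equilibrium: Sequence[Vec3],
    displacement: Sequence[Vec3],
    amplitude: float,
    n_frames: int,
) -> List[List[Vec3]]:
    """Return atom positions for each keyframe over one vibrational period."""
    frames = []
    for k in range(n_frames):
        phase = math.sin(2.0 * math.pi * k / n_frames)
        frames.append([
            (p[0] + amplitude * phase * d[0],
             p[1] + amplitude * phase * d[1],
             p[2] + amplitude * phase * d[2])
            for p, d in zip(equilibrium, displacement)
        ])
    return frames


def bond_pose(p: Vec3, q: Vec3) -> Tuple[Vec3, float, Vec3]:
    """Return the midpoint, length, and unit direction of a bond from p to q.

    A renderer would translate the cylinder to the midpoint, scale it to the
    length, and rotate its axis onto the direction at every keyframe.
    """
    delta = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
    length = math.sqrt(delta[0] ** 2 + delta[1] ** 2 + delta[2] ** 2)
    if length == 0.0:
        raise ValueError("bonded atoms coincide; direction is undefined")
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2, (p[2] + q[2]) / 2)
    direction = (delta[0] / length, delta[1] / length, delta[2] / length)
    return mid, length, direction


if __name__ == "__main__":
    # Symmetric stretch of a linear O=C=O arrangement (illustrative values).
    eq = [(-1.16, 0.0, 0.0), (0.0, 0.0, 0.0), (1.16, 0.0, 0.0)]
    mode = [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    for frame in atom_keyframes(eq, mode, amplitude=0.1, n_frames=4):
        print(bond_pose(frame[1], frame[2]))
```

Recomputing the bond pose from the displaced atom positions at every keyframe is what keeps each cylinder attached to its two spheres throughout the animation.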
Visualizing proteins. When visualizing a protein, the app fetches the CIF data from the Protein Data Bank and then uses Mol*, 31 an open-source macromolecular toolkit, to generate 3D coordinates of the protein. We use the cartoon representation for small or medium proteins and the Gaussian surface representation for large proteins to improve the display performance.
Object recognition. The workflow for object recognition is similar to that of Image-to-SMILES described above. When the user takes a picture of an object, the app also sends the picture to the server. The server uses Google Cloud Vision API and Amazon Rekognition for object detection. [32][33] We have developed a database that maps common objects to molecules; for example, coffee is mapped to caffeine, and carrots are mapped to carotene (Table S1). The server then sends the chemical name of a representative molecule of the photographed object to the app. As in the hand-drawn chemical structure workflow, the app tracks the object during the computation and renders the 3D chemical structure above it.
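The lookup from detected object to molecule can be sketched in Python as below. This is a hypothetical illustration, not the MolAR server code: the input is assumed to be a list of (label, confidence) pairs such as an object detection service might return, the threshold min_score and the plural handling are assumptions, and only the two mappings named in the text are included (the full mapping is given in Table S1).

```python
from typing import Dict, Optional, Sequence, Tuple

# Only the two examples named in the text; see Table S1 for the full mapping.
OBJECT_TO_MOLECULE: Dict[str, str] = {
    "coffee": "caffeine",
    "carrot": "carotene",
}


def normalize(label: str) -> str:
    """Lowercase a label and drop a plural 's' if that yields a known object."""
    label = label.strip().lower()
    if label.endswith("s") and label[:-1] in OBJECT_TO_MOLECULE:
        return label[:-1]
    return label


def pick_molecule(
    labels: Sequence[Tuple[str, float]],
    min_score: float = 0.5,  # assumed confidence threshold
) -> Optional[str]:
    """Return the molecule for the most confident label that has a mapping."""
    for label, score in sorted(labels, key=lambda item: item[1], reverse=True):
        if score < min_score:
            break  # remaining labels are even less confident
        molecule = OBJECT_TO_MOLECULE.get(normalize(label))
        if molecule is not None:
            return molecule
    return None


if __name__ == "__main__":
    print(pick_molecule([("Cup", 0.93), ("Coffee", 0.88), ("Table", 0.61)]))
```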
Quantum chemistry calculation. The electronic structure calculations were performed using TeraChem Cloud 26 at the PBE0/3-21G level of theory in the gas phase.

Data availability
The MolAR program can be downloaded from the Apple App Store.

Code availability
The source code of MolAR can be found at https://github.com/mtzgroup/molar.

Figure 1
Outline of MolAR functionalities, including hand-drawn chemical structure recognition, recognition of molecules in objects, visualization of molecules and biomolecules from public databases, visualization of vibrational normal modes of select molecules and calculation of electronic properties (including dipole moment and molecular orbitals). See the app in action at https://youtu.be/bLqkzz1vZL4.

Figure 2
The MolAR workflow for transforming a hand-drawn chemical structure into a 3D AR model. (1) The app sends the image of the structure to the server, which then (2) feeds the image to Mathpix 25 to predict the SMILES representation. (3) The SMILES is sent to the Online SMILES Translator by the National Cancer Institute to obtain the SDF data. (4) If requested, the server sends the SDF to TeraChem Cloud 26 to compute its chemical properties. Finally, the server returns the SDF and calculation results to the app.
