Robotic Scientist platform. The platform towards a Robotic Scientist for the digital synthesis of nanocrystals involves the convergence of materials databases, cyber systems, and physical systems (Fig. 1).
To accomplish rational design of nanocrystals, the software and AI algorithms are integrated into the cyber system. In addition, process automation by means of a simulated operation system is utilized to pre-examine and monitor the designed synthesis procedures.
In the physical system, crystal growth on the nano-scale is accomplished by automatic synthesis, and characterization is performed on the macro-scale to guide controllable synthesis. Concurrently, Robotic Execution Excel (REE) files are designed to provide preliminary instructions for the execution of automatic synthesis using crucial parameters. The database is expanded continuously by the design and controllable synthesis processes. Furthermore, the relationship between the target nanocrystal morphologies (as outputs) and key synthesis parameters in the database (as inputs) is identified to provide constructive guidance for retrosynthesis. Finally, the closed loop combining rational design, controllable synthesis, and retrosynthesis provides an unprecedented ability to manipulate the morphologies of nanocrystals. It is expected that the Robotic Scientist can be trained for digital synthesis of customized nanocrystals with essential capacities similar to those of human scientists.
In manual synthesis, the tasks are normally time-consuming and error-prone; moreover, in some cases the raw precursors expire or degrade shortly after preparation. To achieve automatic synthesis in a timely fashion, the Robotic Scientist platform is set up with the features shown in Fig. 2. The robot, robotic arms, digital pipettors, mobile camera, and microplate reader are connected to a series of modules that are capable of performing robot-assisted high-throughput synthesis and in-situ characterization. The photograph, schematic representation, and operation video of the platform are presented in Fig. 2a, Fig. 2b, and SI, respectively. The Robotic Scientist platform is expected to revolutionize traditional synthesis processes that rely on well-trained scientists and technicians.
Rational design. Traditionally, manual chemical and materials synthesis processes differ slightly from person to person and sometimes introduce inadvertent errors or bias, leading to diverse outcomes. Moreover, it typically takes several months or even years for a scientist to acquire the required repertoire of synthetic knowledge. Hence, there is a substantial demand to conduct rational design on the Robotic Scientist platform while leveraging the expertise of human scientists. Here, crystal informatics, existing knowledge about synthesis, thermodynamic models, and kinetic models as data-driven scientific hypotheses are integrated into the Robotic Scientist for rational design of nanocrystals (Fig. 3).
Firstly, a crystal database with over 90,000 different crystal facets from seven crystal systems is incorporated into the Robotic Scientist based on our previous research8. The typical morphologies in the cubic system are identified in Fig. 3a and Fig. 3b. The morphology information is then digitally converted to the fractional surface area (FSA) and aspect ratio (AR), and the correlations are analyzed as shown in Fig. S1. The FSAs of the (001) and (00\(\bar{1}\)) planes plotted against the AR of the corresponding nanorods reveal a gradually decreasing trend (Fig. 3a). Afterwards, by exploiting the ability of the artificial neural network (ANN) model to capture the complex morphology evolution process, the relationship between the crystal equilibrium morphology (FSA and AR related) and the surface energy ratio is established using a well-trained ANN model (Fig. S2) based on the crystal informatics database.
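The decreasing trend of end-facet FSA with AR can be illustrated with a deliberately simplified geometry. The sketch below assumes an idealized square-prism rod (side a, length L = AR × a) whose two end faces stand in for the (001) and (00\(\bar{1}\)) planes; this geometry and the function name `end_facet_fsa` are assumptions made for illustration and are not taken from the crystal database described above.

```python
def end_facet_fsa(aspect_ratio: float) -> float:
    """Fractional surface area of the two end faces of a square prism.

    Assumed geometry (illustrative only): side a, length L = AR * a.
    End area = 2 a^2, side area = 4 a L, so FSA = 1 / (1 + 2 * AR).
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect ratio must be positive")
    return 1.0 / (1.0 + 2.0 * aspect_ratio)


# Evaluate the idealized FSA over a few aspect ratios.
aspect_ratios = [1.0, 2.0, 3.0, 4.0, 5.0]
fsa_values = [end_facet_fsa(ar) for ar in aspect_ratios]

for ar, fsa in zip(aspect_ratios, fsa_values):
    print(f"AR = {ar:.1f}  end-facet FSA = {fsa:.3f}")
```

For a cube (AR = 1) the two end faces account for one third of the surface, and the fraction falls monotonically as the rod elongates, which is qualitatively the behavior described for Fig. 3a.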
To train the Robotic Scientist, knowledge of Au nanocrystal synthesis, including key parameters, is extracted from 1,300 related publications by data mining with the aid of the Automated Literature Recommendation System9. Fig. S3 shows the frequency distribution of the synthesis parameters reported in the literature, and Fig. 3c indicates that the L2 concentrations are the most frequently used. Hence, by taking advantage of data mining, the Robotic Scientist is initially trained to capture synthesis parameters, and the identified parameters are then adopted by the Robotic Scientist platform to refine predictions. For example, longitudinal surface plasmon resonance (LSPR) can be characterized in-situ by the Robotic Scientist platform (Fig. S4 and Table S1), and some of the samples are characterized by ex-situ TEM (Fig. S5), XRD, and HR-TEM (Fig. S6) to provide the necessary conditions for the Au nanocrystals10. Consequently, the relationship between the customized morphological FSA and AR is established with accurate one-to-one correspondence as shown in Fig. 3b.
To bridge robot-assisted macro-scale operational synthesis and nano-scale crystal growth, the classical thermodynamic model (derived in the Method section) and the ML-predicted model are explored by taking [Ag+] as an example (Figs. 3d and 3e). The correlation between LSPR (related to morphology) and [Ag+] is derived from the classical model. Furthermore, we have found that the ML-predicted model covers a wider LSPR range (600-925 nm in Fig. 3e) with more accurate prediction (R² = 0.99 in Fig. 3e) than the classical model (666-878 nm with R² = 0.98 in Fig. 3d). Therefore, with the assistance of ML and thermodynamic models, the relationship among morphology, surface energy, LSPR, and [Ag+] is established by the Robotic Scientist platform. Establishment of the thermodynamic model allows the Robotic Scientist platform to realize rational design of desirable nanocrystals using the concentration of synthesis parameters as the input, surface energy and LSPR as the bridge, and nanocrystal morphology as the output.
The kinetics of nanocrystal synthesis provides another key model in rational design that can train the Robotic Scientist for tailoring morphology. In this respect, a microplate reader and a color-ultra-sensitive camera are employed to monitor the UV-Vis-NIR absorption spectra and color changes during nanocrystal growth. The dynamic-state and steady-state optical absorption spectra are displayed in Figs. S7-S10 together with representative results in Fig. 3f-3h for different c(HCl). The dynamic UV-Vis-NIR absorption spectra with peaks of LSPR and transverse surface plasmon resonance (TSPR) are identified in Fig. 3f. The change of the normalized OD_LSPR with time is shown in Fig. 3g, which indicates pseudo-first-order kinetics (derived in the Method section and shown in Fig. S7 and Table S2). A similar trend in the color change (RGB values) with time is presented in Fig. 3h and Fig. S7. These in-situ characterization results are employed to establish the nanocrystal kinetic models. Hence, the Robotic Scientist is guided by the thermodynamic and kinetic models together with ML-trained models to explore controllable synthesis and retrosynthesis.
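As an illustration of how a pseudo-first-order rate constant could be extracted from a normalized optical-density trace, the following minimal sketch assumes the saturating form OD_norm(t) = 1 - exp(-k t) and fits k with a through-origin line on ln(1 - OD_norm). The functional form, the function name, and the synthetic data are assumptions; the model actually used is the one derived in the Method section.

```python
import math


def fit_pseudo_first_order(times, od_norm):
    """Estimate k assuming OD_norm(t) = 1 - exp(-k * t).

    Linearization: ln(1 - OD_norm) = -k * t, fitted as a least-squares
    line through the origin. Points with OD_norm >= 1 are skipped because
    the logarithm is undefined there.
    """
    pairs = [
        (t, math.log(1.0 - y))
        for t, y in zip(times, od_norm)
        if 0.0 <= y < 1.0
    ]
    denominator = sum(t * t for t, _ in pairs)
    if denominator == 0.0:
        raise ValueError("need at least one point with t > 0")
    numerator = sum(t * log_term for t, log_term in pairs)
    return -numerator / denominator


# Synthetic, noise-free trace (hypothetical values, not measured data).
k_true = 0.05  # per minute
times_min = [0.0, 5.0, 10.0, 20.0, 40.0, 60.0]
od_trace = [1.0 - math.exp(-k_true * t) for t in times_min]

k_est = fit_pseudo_first_order(times_min, od_trace)
print(f"estimated k = {k_est:.4f} per minute")
```

With real microplate data, noise near the plateau inflates the error on ln(1 - OD_norm), so a nonlinear fit or a cut-off on late time points may be preferable.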
Controllable synthesis. The complexity of materials synthesis increases exponentially with the number of variables, thereby stifling full exploration of the materials space. The key to controllable synthesis process is convergence of macro-scale automatic synthesis and nano-scale crystal growth to bridge the synthesis parameters (as input) and corresponding morphologies (as output) on the Robotic Scientist platform. In order to achieve this objective, data-intensive rational design and automated synthesis are integrated. Meanwhile, machine learning and experimental data are utilized to construct models based on the appropriate synthesis variables. As a result, orthogonal, single-, double-, and triple-factor experiments can be conducted systematically in the order of iterations to construct the database for effective training of ML models.
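The exponential growth of the experimental space can be made concrete with a short calculation. The sketch below assumes, purely for illustration, a full-factorial design with 24 levels per factor and up to six factors (the single-factor resolution and the number of variables mentioned elsewhere in this section); the platform does not run such a design, which is the point of the staged single-, double-, and triple-factor strategy.

```python
def full_factorial_size(levels_per_factor: int, n_factors: int) -> int:
    """Number of experiments in a full-factorial design."""
    return levels_per_factor ** n_factors


# Assumed for illustration: 24 levels per factor, one to six factors.
LEVELS = 24
sizes = {n: full_factorial_size(LEVELS, n) for n in range(1, 7)}

for n_factors, size in sizes.items():
    print(f"{n_factors} factor(s): {size:,} experiments")
```

Under these assumptions a single factor needs 24 experiments, whereas six factors would need more than 190 million, which is why the variables are screened and then explored in low-dimensional slices.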
Firstly, orthogonal experiments are conducted by executing materials synthesis with parameters obtained by data mining from 1,300 papers (Fig. 3c). They are designed to address the limitations of blind optimization for all the factors at different levels1. The design of experiments with different factors and levels (Table S3), the UV-Vis-NIR absorption results (Fig. S11), and the multivariate analysis of variance (Tables S4-S5) are presented. Based on the experimental conditions from the high-dimensional experimental space, the initial optimized levels are decided for further single-factor study.
To analyze the potential in 1D space, 24 levels are studied for each single factor (Table S6), and the models of the single factors are presented in Figs. 4a–4c and Fig. S12. Moreover, 96-level experiments are carried out within the boundaries identified from the 24-level experimental results to provide more training data for the ML models. The ML models, corresponding coefficients, and accuracy of ML prediction are presented in Tables S7-S8 and Fig. S13. All the results can be fitted well with the ML-predicted models, which is beyond the capacity of the classical model in Fig. 3d (fitted only to the results of the AgNO3 factor). Notably, a broader range of AR is tuned by CTAB, AgNO3, and HCl (compared with Au seeds, AA, and HAuCl4), which are therefore identified and defined as structure-directing agents (SDAs)11–14. The different types of SDAs can be used as triggers on the macro-scale to control the surface energy during nanocrystal growth on the nano-scale. For example, the AgNO3 factor can be adjusted by the Robotic Scientist platform to change the AR values of the nanocrystals in Fig. 4a. Therefore, the relationship between the SDA-based synthesis parameters (inputs) and nanocrystal morphologies (outputs) is identified as the key to achieving controllable synthesis.
To train a more sophisticated Robotic Scientist, double-factor experiments are conducted for pairs of the SDAs identified from the single-factor experiments. In this way, the chemical space is expanded into a 2D response surface with an 8×8 grid (64 experiments), compared to the 1D curves derived from single-factor experiments. Based on the 64 preliminary experiments, 96 experimental conditions are generated by a normal distribution mathematical array for active training of the ML model. The design of the double-factor experiments and the ML-predicted models are presented in Tables S9-S13, and the results are illustrated in Figs. S14-S16. The robust double-factor ML models are then trained with two inputs for morphological control. It is found that CTAB and AgNO3 play dominant roles and that there are noticeable interactions (Fig. 4d), which is consistent with the observation that CTAB and Ag+ form a face-specific capping agent to achieve cooperative morphological control15. Interestingly, the CTAB and HCl factors exhibit similar cooperative morphology control (Fig. 4e). However, there is only additive behavior for the AgNO3 and HCl factors (Fig. 4f), and AgNO3 plays the leading role in this two-factor experiment. A complex response profile is created in the triple-factor experiments by adjusting three SDAs. The design of the triple-factor experiments and the ML-predicted models are shown in Tables S14-S16. The visualized response of AR to the three factors is presented in Fig. 4g and Fig. S17. Therefore, the function relating the SDA parameters as inputs to the AR features as outputs can be established for controllable synthesis of the nanocrystals in a free 3D space.
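The two-stage double-factor design (an 8×8 grid followed by 96 normally distributed conditions) can be sketched as follows. The text does not specify how the normal-distribution array is constructed, so the scheme here, which draws Gaussian samples around a chosen center and rejects points outside the factor bounds, is an assumption, as are the normalized 0-1 concentration ranges, the center, the spreads, and all function names.

```python
import itertools
import random


def linspace(lo, hi, n):
    """Return n evenly spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def grid_design(range_a, range_b, n=8):
    """Full n x n grid over two factors (64 conditions for n = 8)."""
    return list(itertools.product(linspace(*range_a, n), linspace(*range_b, n)))


def sample_around(center, sigmas, bounds, n, seed=0):
    """Draw n Gaussian points around `center`, rejecting out-of-bounds draws."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        candidate = tuple(rng.gauss(c, s) for c, s in zip(center, sigmas))
        if all(lo <= x <= hi for x, (lo, hi) in zip(candidate, bounds)):
            points.append(candidate)
    return points


# Hypothetical normalized concentration ranges for two SDAs.
bounds = ((0.0, 1.0), (0.0, 1.0))
grid = grid_design(*bounds)  # 64 preliminary conditions
followup = sample_around(
    center=(0.5, 0.5),       # assumed region of interest from the grid
    sigmas=(0.15, 0.15),     # assumed spread
    bounds=bounds,
    n=96,                    # one 96-well microplate
)
print(len(grid), len(followup))
```

Concentrating the follow-up plate around a region of interest places more training points where the response changes most, at the cost of sparser coverage near the edges of the grid.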
At the same time, color features can be investigated as potential outputs by the Robotic Scientist platform. The results from single-factor, double-factor, and triple-factor experiments are shown in Figs. S18-S20, respectively, and the corresponding LSPR and RGB values are listed in Tables S17-S19 for ML training. The trained ML model and the comparison between experimental and ML-predicted values are presented in Table S20 and Fig. S21; a satisfactory ML model with an R² of 0.94 is obtained. As shown in Fig. 4h, the color results, as another large-sample data set, match the spectra well. In this way, the Robotic Scientist can be trained to digitally recognize colors, thus contributing color features to the materials genome database.
Finally, with the aid of the Robotic Scientist platform, over 2,300 samples are synthesized together with in-situ characterization to build up the Au nanocrystals genome (various morphologies with LSPR from 600 to 1,000 nm) (Fig. 4i). It is estimated that this task would have taken a human scientist up to four months (18 samples per day), in comparison with less than one week (384 samples on four 96-well microplates per day) taken by the Robotic Scientist. The Robotic Scientist continues to improve as it is trained with expanding experimental data and ML-predicted data, with the ultimate goal of an intelligent system for digital nanocrystal synthesis and, potentially, retrosynthesis based on these data sources, as described in the next section.
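The time estimate above follows from simple throughput arithmetic, reproduced here as a quick check. The sketch uses the figures quoted in the text (2,300 samples as a lower bound, 18 samples per day manually, four 96-well plates per day on the platform) and assumes every day is a working day.

```python
import math

TOTAL_SAMPLES = 2300          # "over 2,300", so a lower bound
MANUAL_PER_DAY = 18           # samples per day for a human scientist
ROBOT_PER_DAY = 4 * 96        # four 96-well microplates per day

# Whole days needed at each throughput, rounding up partial days.
manual_days = math.ceil(TOTAL_SAMPLES / MANUAL_PER_DAY)
robot_days = math.ceil(TOTAL_SAMPLES / ROBOT_PER_DAY)

print(f"manual: {manual_days} days, robot: {robot_days} days")
print(f"throughput ratio: {ROBOT_PER_DAY / MANUAL_PER_DAY:.1f}x")
```

At these rates the manual route needs roughly 128 days (about four months) and the platform about 6 days, in line with the estimate in the text.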
Retrosynthesis. The Robotic Scientist is further developed for retrosynthesis based on the knowledge learned from controllable synthesis. The Au nanocrystals genome plays a vital role in supporting a closed-loop synthesis process. The genome, with typical LSPR from 600 to 1,000 nm displayed in Fig. 5a, consists of experimental data, ML-predicted data, and TEM validation results (Fig. 5b). Building such a genome within a six-variable experimental space would be practically impossible with the manual approach because the experimental complexity scales exponentially with the number of variables1. The identified SDAs and the morphologies are illustrated as ‘Input’ and ‘Output’, respectively, in Fig. 5a. By normalizing the different SDA parameters that form different nanocrystal morphologies, with the surface energy acting as the trigger on the nano-scale, precise morphological control is accomplished. The genome is constructed for effective retrosynthesis (Fig. 5c) and efficient scale-up production of Au nanocrystals (Fig. 5d and 5e) to facilitate digital synthesis of Au nanocrystals.
Retrosynthesis and optimization are the Robotic Scientist’s creative endeavors. The data of the target Au nanocrystals (such as LSPR of 808 ± 10 nm, 780 ± 10 nm, and 633 ± 10 nm), which are commonly used in biotechnology and information technology (for example, HIV drug delivery16, surface-enhanced Raman scattering17, wireless neuromodulation18, and sensing19,20), are extracted from the genome for further retrosynthesis study. Using 808 ± 10 nm as an example, 99 samples are selected from the previous single-, double-, and triple-factor experiments as shown in Fig. 5c and Table S21. At the same time, by focusing on the best samples, additional samples are predicted by the ML models. Afterwards, optimization experiments are executed by the Robotic Scientist platform. It is generally accepted that a larger OD ratio (OD_LSPR/OD_TSPR) and a narrower FWHM (at fixed LSPR) represent a more uniform morphology. Hence, the experiments are designed with a decision plate to optimize the target nanocrystals for higher shape uniformity by evaluating the OD ratio and FWHM of samples from different synthesis routes (Fig. 5c and Table S22). Finally, the highest-quality samples in the decision plate are recommended for the scale-up experiments.
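The selection logic of the decision plate can be sketched as a filter followed by a ranking. The example below keeps samples whose LSPR lies within the target window and orders them by descending OD ratio, breaking ties by narrower FWHM. How the two criteria are actually weighted is not stated in the text, so this lexicographic ordering, the `Sample` class, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    name: str
    lspr_nm: float   # LSPR peak position
    od_lspr: float   # optical density at the LSPR peak
    od_tspr: float   # optical density at the TSPR peak
    fwhm_nm: float   # full width at half maximum of the LSPR peak

    @property
    def od_ratio(self) -> float:
        return self.od_lspr / self.od_tspr


def rank_candidates(samples, target_nm, tol_nm=10.0):
    """Keep samples inside target +/- tol, best first.

    Assumed ordering: larger OD ratio first, then narrower FWHM.
    """
    in_window = [s for s in samples if abs(s.lspr_nm - target_nm) <= tol_nm]
    return sorted(in_window, key=lambda s: (-s.od_ratio, s.fwhm_nm))


# Hypothetical samples, not data from the genome.
plate = [
    Sample("A", lspr_nm=805.0, od_lspr=2.0, od_tspr=0.5, fwhm_nm=120.0),
    Sample("B", lspr_nm=812.0, od_lspr=1.0, od_tspr=0.4, fwhm_nm=150.0),
    Sample("C", lspr_nm=830.0, od_lspr=2.4, od_tspr=0.4, fwhm_nm=100.0),
    Sample("D", lspr_nm=800.0, od_lspr=2.0, od_tspr=0.5, fwhm_nm=110.0),
]

ranked = rank_candidates(plate, target_nm=808.0)
print([s.name for s in ranked])
```

Sample C is excluded despite its high OD ratio because its LSPR falls outside the 808 ± 10 nm window; a weighted score could replace the lexicographic key if the two criteria need to be traded off.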
Three scale-up experiments are conducted sequentially, i.e., a high-throughput microplate assay on the Robotic Scientist platform (Fig. 5d), a bench-scale test on a magnetic stirrer, and a pilot-scale test in an agitated vessel (Fig. 5e). Firstly, 2 mL (on a 12-well microplate), 4 mL (on a 6-well microplate), and 20 mL and 40 mL (on a single-well plate) experiments are performed on the Robotic Scientist platform for synthesis of the 633, 780, and 808 nm samples. During the scale-up process, an interesting feature in retrosynthesis is that the LSPR gradually red-shifts compared to the results in the nanocrystal genome (Fig. 5d), which provides new insights into the scaling law. By taking advantage of the kinetics study, an SDA such as HCl is identified as an effective input for minor adjustment in the scale-up process. A slight decrease of c(HCl) adjusts the LSPR according to the established scaling law. Modification by adjusting c(HCl) is proven to be applicable, and a pilot-scale experiment (15 L) is then demonstrated in an agitated vessel (Fig. S22). Therefore, this study reveals a retrosynthesis and scale-up methodology that takes advantage of the Au nanocrystals genome and the kinetics study on the Robotic Scientist platform, and it is expected to have broad applications in the production of similar nanocrystals.