Producing pharmaceutical products that have the desired quality attributes and are free of untoward effects is not only the most basic requirement set by regulatory authorities but also, indirectly, a determinant of the success of the pharma industry. Pharmaceutical product development is a combined and coordinated effort across numerous divisions of the industry: it usually starts with the discovery, synthesis and utilization of the active pharmaceutical ingredient (API) and ends with formulation development, market positioning and successful use by the consumer.
The conventional API discovery and development process, as well as the optimization of analytical methods and dosage-form formulas, routinely uses the quality by testing (QbT) approach or the one factor at a time (OFAT) strategy. Because the OFAT strategy and the QbT approach are inevitably time-consuming and wasteful of chemicals, there is an urgent need for an alternative approach that requires less time and minimizes chemical wastage. Within the response surface methodology (RSM) framework, design of experiments (DoE) is being used effectively to optimize API synthetic processes, analytical method development and the formula of final pharmaceutical products (Rahman et al., 2021; Rahman et al., 2020). It should be added that the RSM-linked DoE is based on a linear model and therefore does not use non-linear modelling to describe the influence of independent factors on the response variables. Thus, relying solely on DoE for such optimization may lead to erroneous conclusions, which necessitates another approach (preferably one based on a non-linear model) that judiciously eliminates the errors observed with DoE-based optimization. One such non-linear model-based approach recently introduced into the pharmaceutical product development process is artificial intelligence and deep/machine learning (AI & D/ML) or the artificial neural network (ANN). Because the ANN processes information in a manner related to the functioning of the human brain, it estimates the process parameters from the trials performed by varying the composition of the excipients and the processing conditions (Ghate et al., 2019). AI is a division of computer science concerned with problem resolution and with creating machines that can perform tasks that would otherwise require intelligence and human operators (Senthuraman, 2020).
In simple words, AI & D/ML is a branch of computer science that deals with problem-solving through the aid of symbolic programming (Krishnaveni et al., 2019). DL consists of a multi-layer neural network that aims to emulate how the human brain processes information in order to understand complicated patterns and feature interactions. In practical situations, deep learning helps to give structure to unstructured data and enables machines to learn to classify data without assistance (Abhinav & Subrahmanyam, 2019; Gilvary et al., 2019). ML is a subset of AI that uses algorithmic models and statistical methods and has the ability to learn with or without being explicitly programmed (Abhinav & Subrahmanyam, 2019). It should be added that AI & D/ML or ANN is based on a non-linear concept for studying the influence of independent factors on response variables. Here, non-linearity refers to a massively parallel, distributed network that allows approximation and real-time operation and can represent unpredictable and random behaviour. Figure 1 shows the possible areas of application of AI & D/ML or ANN in the pharmacy field of the healthcare system.
[Insert Fig. 1 here]
Several excellent general reviews have been published (Cavasotto & Di Filippo, 2021; Chaki et al., 2020; Damiati et al., 2020; Paul et al., 2021; Zhou et al., 2020), and an entire issue of a journal (Artificial Intelligence) has been devoted solely to the subject of AI & D/ML or ANN. Furthermore, as one of the non-linear models for pharmaceutical product optimization, AI & D/ML or ANN has been either combined with a linear model such as RSM (Ghate et al., 2019; El Menshawe et al., 2014; Naveen et al., 2020) or used on its own for formula optimization of dosage forms (Moghaddam et al., 2010; Manda et al., 2019; Lefnaoui et al., 2020) and food products (Rakhavan et al., 2016; Dash & Das, 2021; Samson et al., 2016). Similarly, the optimization of analytical method conditions for APIs has also been carried out through AI & D/ML or ANN (Arabzadeh et al., 2019) with support from RSM (Rahman et al., 2021). The objectives of this “meta-review” are (1) to provide a historical overview of AI & D/ML or ANN, (2) to update the financial dealings of pharma companies related to the application of AI & D/ML or ANN in drug discovery and development processes and (3) to showcase the application of the AI & D/ML or ANN concept for optimization of analytical method conditions and the formula of the dosage form.
Historical overview on AI & D/ML or ANN
Table 1 delineates the historical overview of AI & D/ML or ANN in ascending order of milestone years. The ANN has a history that dates to the pre-computer era. Although the original goal of AI & D/ML or ANN was to solve problems mostly related to biology, its applicability has been extended considerably in recent years, most intriguingly to medical diagnosis and to forecasting the disease pattern in a particular region or across the whole world. There are three reasons for this. Firstly, AI & D/ML or ANN helps us understand the impact of increasing or decreasing disease progression, vertically or horizontally, on computational time. Secondly, it helps us understand the situations or cases where the model fits best. Thirdly, it explains why a certain model works better in a certain environment or situation. Samson et al. (2016) described the ANN as an information-processing paradigm that is related to biological nervous systems, i.e., the human brain. In the early 1940s, researchers developed the “threshold logic” model (McCulloch & Pitts, 1943), which opened a two-pronged approach to computational models of ANN: one branch focused on identifying the processes of biological neural networks, and the other on the application of neural networks to artificial intelligence. An unsupervised learning model based on neural plasticity and long-term potentiation was introduced in the form of Hebbian learning (Morris, 1999). This type of learning was then implemented on calculators and other computational instruments (Rochester et al., 1956; Farley & Clark, 1954). The development of the perceptron added another dimension to AI & D/ML or ANN by incorporating a two-layer computer network for pattern recognition through an algorithm (Rosenblatt, 1958). In particular, ANNs comprise a set of nodes, each of which receives a separate input that is finally converted to an output.
After backpropagation was clearly defined (Werbos, 1975), supervised learning, in which the ANN is trained in conjunction with optimization, became easier because the loss function can be determined when the anticipated output for each input value is known. In this way, ANNs are linked to single or multiple algorithms to solve problems (Paul et al., 2021). Although the terms AI & D/ML were coined around the 1950s, they have now become a slogan in the pharma industry, especially since their benefits in handling increased volumes of raw data became apparent following the introduction of advanced algorithms (Sharma, 2019).
[Insert Table 1 here]
Pharma companies’ financial dealings related to the application of AI & D/ML or ANN in drug discovery and development processes
Table 2 displays the financial dealings of pharma companies concerning the application of AI & D/ML or ANN, particularly in drug discovery and development processes. The entry of AI & D/ML or ANN not only shortens the new drug development period but also significantly reduces the manpower required and the expenditure related to API development. For example, the German biotechnology company Evotec partnered with the UK-based company Exscientia for small-molecule drug discovery. Within a short period of 8 months, the discovered small-molecule drug candidate entered Phase 1 clinical trials, whereas delivering a drug candidate through the traditional drug discovery process (without AI & D/ML or ANN) would usually have taken 4-5 years.
[Insert Table 2 here]
AI & D/ML or ANN in optimization of analytical method conditions and formula of dosage form
Before entering the discussion of the optimization of analytical method conditions and the formula of the dosage form, it needs to be emphasized that the ANN simply mimics the principles of information processing in the human brain: the influence of variation in critical material attributes on critical analytical attributes (CAAs) can be predicted by segregating the data sets (generated from numerous trials) into training, testing and validation sets (Samson et al., 2016). For this purpose, the ANN must be coupled with an algorithm to attain the “best fit” optimum values for a method (Ghaheri et al., 2015). The ANN-linked algorithm is a more reliable and better predictor of the optimum values for a method than the RSM-based linear model (Sha & Edwards, 2007). However, the ANN depends on the number of experiments/trials conducted; consequently, too many or too few trials are highly likely to result in errors and faults in the predictions (Ghate et al., 2019). Therefore, the ANN takes the trials of the linear model (RSM following a face-centred central composite design (CCD)) to generate the non-linear model [ANN-linked Levenberg-Marquardt (LM) algorithm], which predicts the optimum regions for the studied CAAs (Rahman et al., 2021). Furthermore, the ANN-linked LM is a potent chemometric method because of its high performance and good prediction for non-linear systems (Ghaedi, 2015). The typical network architecture of AI & D/ML or ANN is organized in three different types of layer, viz., one input layer, one output layer and one or more hidden layers. Figure 2 portrays the schematic architecture of an AI & D/ML or ANN having three input, ten hidden and three output neurons (3:10:3). This architecture is the most common multi-layered perceptron (MLP) type, which is built on four elements: the input, hidden and output layers, along with the connections or weights.
Interestingly, the MLP-type AI & D/ML or ANN works in two phases, training and testing. The training phase is based on the iterative presentation of the available data patterns to teach the AI & D/ML or ANN to accomplish the designated task. Figure 3 shows the various steps involved in developing the neural network. Other frequently used neural network types are the Kohonen network, the convolutional neural network (CNN) and the recurrent neural network (RNN).
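The 3:10:3 architecture and the partition of trials into training, testing and validation sets described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names (init_mlp, forward, split_runs), the tanh hidden activation, the linear output layer, the random initial weights and the 70/15/15 split are all hypothetical choices that are not taken from the cited studies, and the LM training step itself is not shown.

```python
import math
import random


def init_mlp(n_in=3, n_hidden=10, n_out=3, seed=0):
    """Create random weights and zero biases for an n_in:n_hidden:n_out network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    b2 = [0.0] * n_out
    return w1, b1, w2, b2


def forward(x, params):
    """Propagate one input vector through a tanh hidden layer and a linear output layer."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def split_runs(runs, train=0.70, test=0.15, seed=0):
    """Shuffle the experimental runs and split them into training, testing and validation sets."""
    shuffled = list(runs)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])


# Untrained network: three coded factor levels in, three response values out.
params = init_mlp()
prediction = forward([0.0, 1.0, -1.0], params)

# Twenty hypothetical run indices stand in for real experimental trials.
train_set, test_set, val_set = split_runs(range(20))
```

In practice the weights would be fitted on the training set (for example with an LM-type optimizer), while the testing and validation sets would be held back to check how well the fitted network generalizes.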
[Insert Figs. 2 & 3 here]
Figure 4 depicts a possible way to integrate AI & D/ML or ANN in the optimization of analytical method conditions and the formula of the dosage form. It can be seen from Figure 4 that AI & D/ML or ANN needs to be integrated with the DoE approach for optimizing the conditions of an analytical method and the formula of a dosage form. This type of integration between AI & D/ML or ANN and DoE has given rise to the new term “double-stage systematic optimization”. Double-stage systematic optimization therefore starts with the conventional DoE approach and is followed by the application of AI & D/ML or ANN. For instance, Rahman et al. (2021) used the RSM generated from a face-centred CCD for the DoE stage and an ANN linked with the LM algorithm for the AI & D/ML or ANN stage. Table 3 displays a selected, non-exhaustive list of publications showing the involvement of AI & D/ML or ANN in the optimization of analytical method conditions and the formula of the dosage form.
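To make the first (DoE) stage more concrete, the coded run list of a face-centred CCD can be generated as in the sketch below. It is a minimal illustration only: the function name face_centred_ccd, the choice of three factors and the single centre point are assumptions made for this example and do not reproduce the design of any cited study.

```python
from itertools import product


def face_centred_ccd(n_factors=3, n_center=1):
    """Return coded design points (-1, 0, +1) of a face-centred central composite design."""
    # Full two-level factorial part: every combination of low (-1) and high (+1).
    factorial = [list(point) for point in product((-1, 1), repeat=n_factors)]

    # Axial part: one factor at -1 or +1, all others at 0.
    # With alpha = 1 these points lie on the faces of the factorial cube.
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level
            axial.append(point)

    # Centre point(s): all factors at their mid level.
    center = [[0] * n_factors for _ in range(n_center)]
    return factorial + axial + center


design = face_centred_ccd()
print(len(design))  # 8 factorial + 6 axial + 1 centre = 15 runs
```

The responses measured for these runs would then serve both for fitting the RSM model and as the training, testing and validation data of the ANN in the second stage.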
[Insert Figure 4 & Table 3 here]
Because AI & D/ML or ANN has an advantage in dealing with complex and unstructured data, it is well suited to a wide range of applications in the pharmaceutical sciences and to easing the processes involved (Simões et al., 2020). Table 4 displays the various algorithms that have been coupled with AI & D/ML or ANN in different pharmaceutical product development processes.
[Insert Table 4 here]
Representing the drug release process with computationally simple empirical models is a challenging task because there are complicated interactions between formulation and processing variables. The effort in the pre-prescription step would be considerably reduced if an AI & D/ML or ANN model could forecast drug release, and the accuracy of such predictions has been demonstrated. Nagy et al. (2019) used near-infrared (NIR) and Raman spectra to compare four three-layer ANN models with standard partial least squares (PLS) regression for predicting the dissolution profile of extended-release anhydrous caffeine tablets. Brahima et al. (2017) used an MLP to model the release behaviour of riboflavin from poly(NIPA-co-AAc) hydrogels. The ANN was validated and, when compared with RSM on the basis of the mean square error (MSE), proved more appropriate for predicting the release of riboflavin from the hydrogels and generalized well over the release behaviour of the hydrogel. Additionally, Elman neural networks (ENNs) and other dynamic neural networks can be used to forecast dissolution profiles. Petrovic et al. (2012) used both an ENN and an MLP to characterise the release curve of tablets and established the wide applicability of ANN. Husseini et al. (2009) and Moussa et al. (2017) used ANN to optimise the ultrasonic release of APIs from preparations (such as liposomes and micelles) to keep therapeutic concentrations constant at specific sites. Han et al. (2018) predicted the disintegration time of orally disintegrating tablets using neural network techniques. A few selected examples of analytical method condition optimization and formula optimization that integrate the AI & D/ML or ANN concept are narrated below.
SVM in formulation development
Wang et al. (2022) used particle swarm optimization combined with a least squares support vector machine (PSO-LSSVM) to simplify the optimization process. The results of the prediction model and the Taguchi design were compared with those of PSO-LSSVM. Additionally, this model provided lower costs and a more efficient design of pharmaceutical formulations.
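The particle swarm part of such a hybrid can be illustrated with a basic global-best PSO. The sketch below is not the PSO-LSSVM of the cited work: the function pso_minimize, its parameter values (inertia, c_personal, c_global) and the quadratic toy_error objective, which merely stands in for a model error that would depend on two tunable parameters, are all assumptions made for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise `objective` inside box `bounds` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_val = [objective(p) for p in pos]         # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]     # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull towards personal best + pull towards global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clip it to the allowed range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def toy_error(params):
    """Hypothetical stand-in for a model error; its minimum lies at (2.0, -1.0)."""
    return (params[0] - 2.0) ** 2 + (params[1] + 1.0) ** 2


best_params, best_error = pso_minimize(toy_error, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a PSO-LSSVM setting the objective would instead be something like the cross-validated prediction error of the LSSVM for a given pair of hyperparameters, so that the swarm searches for the hyperparameter values that give the best predictions.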
SVM in analytical method development
Keyvan et al. (2021) suggested a UV spectrophotometric method developed with a feed-forward artificial neural network (FFNN) and a least squares support vector machine (LS-SVM) to determine sofosbuvir and daclatasvir simultaneously in tablets and biological fluid. The results indicated that the technique has a high potential for predicting component concentrations in dosage forms with a shorter analysis time.
GA in formulation development
Kumar and Kumar (2019) used a hybrid genetic algorithm (GA) integrated with a back-propagation ANN (BPANN) and RSM based on a central composite design, with water fraction, surfactant fraction, powder density and ultrasonication time as the factors analysed. The results indicated that the multi-objective hybrid GA model gave more robust results than the conventional method.
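How a GA searches a factor space of this kind can be sketched with a simple real-coded GA. The example below is an assumption-laden illustration rather than the hybrid GA-BPANN-RSM of the cited study: the function ga_minimize, the tournament selection, uniform crossover, Gaussian mutation and elitism, as well as the toy_response objective with four coded factors and an arbitrary target point, are all hypothetical.

```python
import random


def ga_minimize(objective, bounds, pop_size=30, n_gen=60, mutation_rate=0.2, seed=0):
    """Minimise `objective` with a simple real-coded GA (tournament selection, elitism)."""
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(pop):
        # Pick two individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if objective(a) < objective(b) else b

    def crossover(p1, p2):
        # Uniform crossover: each gene is copied from one of the two parents.
        return [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]

    def mutate(ind):
        # Gaussian perturbation of some genes, clipped to the factor bounds.
        return [min(max(g + rng.gauss(0.0, 0.1 * (hi - lo)), lo), hi)
                if rng.random() < mutation_rate else g
                for g, (lo, hi) in zip(ind, bounds)]

    pop = [random_individual() for _ in range(pop_size)]
    history = []                                   # best objective value per generation
    for _ in range(n_gen):
        elite = min(pop, key=objective)
        history.append(objective(elite))
        children = [mutate(crossover(tournament(pop), tournament(pop)))
                    for _ in range(pop_size - 1)]
        pop = [elite] + children                   # elitism keeps the best design found so far
    best = min(pop, key=objective)
    return best, objective(best), history


def toy_response(x):
    """Hypothetical objective over four coded factors; smaller is better."""
    target = (0.3, -0.2, 0.5, 0.0)                 # arbitrary illustrative optimum
    return sum((xi - ti) ** 2 for xi, ti in zip(x, target))


factor_bounds = [(-1.0, 1.0)] * 4                  # four factors in coded units
best, best_val, history = ga_minimize(toy_response, factor_bounds)
```

In a hybrid scheme the objective would typically be evaluated by a fitted surrogate such as an ANN or an RSM polynomial rather than by a closed-form toy function, so that the GA explores candidate formulations without additional experiments.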
GA in analytical method development
In the study by Attia et al. (2021), GA-ANN was used for the quantitative analysis of velpatasvir and sofosbuvir, whose UV absorption spectra show some overlap that makes the simultaneous estimation of the two drugs difficult. GA-ANN proved effective in estimating the drugs, with acceptable root mean square errors of calibration and prediction.
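The root mean square errors of calibration and prediction mentioned above are the same statistic computed on two different sample sets. The sketch below shows the calculation; the function name rmse and all concentration values are hypothetical and were invented purely to illustrate the arithmetic, not taken from the cited study.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted concentrations."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical concentrations (arbitrary units), invented for illustration only.
calibration_reference = [2.0, 4.0, 6.0, 8.0]
calibration_predicted = [2.1, 3.9, 6.1, 7.9]
validation_reference = [3.0, 5.0, 7.0]
validation_predicted = [3.2, 4.8, 7.2]

rmsec = rmse(calibration_reference, calibration_predicted)   # error on samples used to fit the model
rmsep = rmse(validation_reference, validation_predicted)     # error on samples held back from fitting
```

A prediction error that is much larger than the calibration error would suggest that the model has been over-fitted to the calibration samples.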
Autoencoder in analytical method development
Kensert et al. (2021) developed a deep one-dimensional convolutional autoencoder that simultaneously eliminates baseline noise and baseline drift so that analytes can be detected and quantified in a large and diverse set of chromatograms, surpassing approaches such as Savitzky-Golay smoothing, Gaussian smoothing and wavelet smoothing.
AI & D/ML or ANN in optimizing eutectic solvent system
The use of AI & D/ML or ANN to predict and select solvent systems is a very interesting integration of academia and industry. AI & D/ML or ANN can assist in optimizing eutectic systems by designing a solvent system with appropriate properties. With AI & D/ML or ANN-based algorithms, a eutectic solvent system may be chosen in which the products separate automatically from the reaction solution (self-precipitation) (Amar et al., 2019; Von Lilienfeld, 2018).