Computing optimal drug dosing with OptiDose: implementation in NONMEM

Determining a drug dosing recommendation with a PKPD model can be a laborious and complex task. Recently, an optimal dosing algorithm (OptiDose) was developed to compute the optimal doses for any pharmacometrics/PKPD model for a given dosing scenario. In the present work, we reformulate the underlying optimal control problem and elaborate how to solve it with standard commands in the software NONMEM. To demonstrate the potential of the OptiDose implementation in NONMEM, four relevant but substantially different optimal dosing tasks are solved. In addition, the impact of different dosing scenarios as well as the choice of the therapeutic goal on the computed optimal doses are discussed.


Introduction
Determining the drug dosing for a single individual or the typical individual representing a population is an important task in clinical application and drug development [1]. With a developed pharmacometrics / PKPD model, usually numerous simulations for various dosing scenarios are performed, and the best dosing is selected by the modeler based on the most suitable response. While the modeling and simulation procedure has remarkably improved dosing recommendations for many drugs [2], trial-and-error approaches remain time-consuming and limited to the choices of the tested doses, likely missing out on the optimal doses. Using optimal control theory [3,4], optimality can be defined and quantified in terms of how well a certain goal is achieved, and thus provide a rational base for finding the optimal dosing. Moore [5] provided an excellent introduction to optimal control in PKPD modeling which has been utilized to optimize, e.g., HIV and cancer treatments [6][7][8][9][10], and intervention strategies in infectious diseases [11][12][13]. Recently, Gontijo et al. [14] applied optimal control in a linear setting for Colistin dose selection.
The optimal dosing algorithm OptiDose [15,16] solves an optimal control problem (OCP) to compute the doses which bring the associated response of the PKPD model as closely as possible to a desired therapeutic goal. OptiDose was specifically designed for optimal dosing tasks in PKPD modeling since it optimizes the drug doses, i.e., the control variables, generating the drug level according to the dosing scenario which then drives the model response, and it is robust with respect to jumps in the model solution occurring for intravenous (IV) bolus, oral and subcutaneous dose administration.
To set up an optimal dosing task in OptiDose, the following three parts are essential. First, a PKPD model with fixed model parameters is necessary. Second, the dosing scenario must be defined, i.e., dosing time points, route of administration, maximal tolerable doses, and how often the dose can vary, e.g., daily, weekly, etc. Third, the therapeutic goal needs to be characterized by a mathematical function called the reference function in [15].
Here, we present a reformulation of the OCP solved in OptiDose as an optimization problem which can be implemented in NONMEM [17]. Similar to estimating model parameters for a PKPD model based on a given objective function, now the doses for the PKPD model with fixed model parameters are computed by minimizing an appropriate objective function. For example, the idea to treat the PK parameters as independent variables and estimate optimal dose rates in NONMEM based on defining a target variable and an objective function penalizing the deviation from the target was previously published by Jönsson et al. [18].
The paper is organized as follows. In the Motivation section, the optimal dosing task is illustrated based on an indirect response model [19,20]. In the Theoretical section, the general case of an OCP and its reformulation as an optimization problem are presented. In the Methods section, the necessary files and commands to solve an optimal dosing task in NON-MEM are explained. In the Results section, the OptiDose implementation in NONMEM is applied to four relevant but substantially different PKPD models with different optimal dosing tasks. First, the indirect response model from the Motivation section is considered, and the impact of different dosing scenarios and different reference functions describing the therapeutic goal is investigated. Second, a complex multidimensional binding kinetics model for a bispecific antibody [21] is applied with the therapeutic goal to keep the ternary drug-receptor complex at the most beneficial level. Third, a PK model is utilized with the therapeutic goal to reach a target AUC of the drug concentration. Fourth, a delay differential equation model characterizing rheumatoid arthritis [22][23][24] is implemented with the therapeutic goal to stabilize the disease progression. Finally, the paper is completed by concluding remarks in the Discussion section.

Motivation
As a motivating example, we consider an elevated biomarker B, characterized by an indirect response model, which shall return safely to the healthy level, e.g., a reference range. The drug is administered via IV bolus into the central compartment A with volume of distribution V , and the resulting concentration C = A/V stimulates the outflow of the biomarker. The PKPD model reads The inflow I n(t, D) describes the drug administration of the doses D, and the model parameters θ = (V , B 0 , k out , k in , k el , E max , EC 50 ) characterize a single individual or the typical individual of a population.
The therapeutic goal is to control the biomarker B to follow a predefined reference function B re f specifying the desired response. During the loading phase of two weeks, i.e., for 0 ≤ t ≤ 14 days, an approach towards the target biomarker level B tar is characterized by the quadratic function which is then followed by the maintenance phase for 14 ≤ t ≤ 42 days where the target biomarker level shall be maintained, i.e., The therapeutic goal of Eq. (2) has been designed to meet the clinical requirement that it quickly declines while avoiding sudden changes, compare Fig. 1. A clinically realistic dosing scenario for non-hospitalized patients is a daily administration of doses that may change on a weekly basis. The aim is to compute the six doses D = (D 1 , . . . , D 6 ) for the six weeks which bring the biomarker level B, described by the PKPD model, as closely as possible to the desired reference B re f . Mathematically, this is achieved by minimizing the objective function This objective function value (OFV) can also be computed via appending an artificial state variable obj, i.e., to the PKPD model equations which then allows to solve the optimization problem in NONMEM. In Fig. 1, the therapeutic goal is depicted as well as the associated response of the PKPD model for an initial guess of the doses of 1 for all six weeks, i.e., D i = 1, i = 1, . . . , 6. In addition, the OFV is illustrated by the shaded area between the two curves.

Theoretical
In this section, the general case of an optimal dosing task is presented. First, the PKPD model equations are stated in their general form. Second, for given dosing scenario and therapeutic goal, the optimal control problem of OptiDose is formulated. Third, its reformulation as an optimization problem, which allows an implementation in NONMEM, is discussed.

The general PKPD model
A PKPD model is a system of parameter-dependent ordinary differential equations d dt y(t) = f (t, θ, y(t)) + I n(t, D) e k , y(0) = y 0 (θ ) (4) defined in the time interval t ∈ [0, T ] where f : [0, T ] × Θ × R n → R n describes the PKPD mechanism, i.e., the control system. Further, θ denotes the model parameter in the set Θ of admissible model parameters. Drug administration is denoted by I n : [0, T ] × R m → R, e.g., orally or subcutaneously, as IV bolus injection or as IV infusion, respectively. The dosing compartment is indicated by k, e.g., k = 1 in the motivating example, and e k ∈ R n is the corresponding unit vector. The doses D = (D 1 , . . . , D m ) ∈ R m are administered at fixed dosing time points specified in the dosing scenario. For given D the so-called state equation Eq. (4) admits a unique solution y(D) : [0, T ] → R n (under smoothness assumptions on the right-hand side, see [15]).

The optimal control problem
With the optimal doses, the aim is to achieve a predefined therapeutic goal specified by a reference function h re f : [0, T ] × Θ → R. In addition, a function h : R n → R describes the associated response of the PKPD model Eq. (4), i.e., of the state solution y(D), with respect to the current doses D. The objective function measures the difference between the associated response and desired therapeutic goal. Consequently, the optimal doses D * minimize the objective function Eq.

The reformulation as optimization problem
To compute the optimal doses D * in the software NONMEM, the OCP Eq. (6) is reformulated as an equivalent optimization problem. Moreover, for some applications, such as to control the AUC of a PK process, a slightly more general objective function is needed with a function γ : recovers the previous objective function Eq. (5).
Then the augmented state equation for z = (y, obj) is given by Eqs. (4), (9) with solution z(D) : [0, T ] → R n+1 , and the computation of the OFV is hidden in the additional state variable obj. This can be seen straightforward from Equation (10) shows that the optimization problem, defined by the augmented state equation Eqs. (4), (9) together with the transformed function where e n+1 ∈ R n+1 denotes the (n + 1)-th unit vector, is equivalent to the OCP defined by Eqs. (4), (7). To further enlarge the variety of applications, slightly more general as in Eq. (11), we introduce with a nonlinear function L : R n+1 → R and an additional term R : R m → R depending on the administered doses, compare [15] where R(D) = m i=1 α i n i D i with weights α i and the number n i of repeated administrations of each dose D i in favor of a linear dependency [5] to lower drug doses.
Solving the reformulated problem means minimizing the function J tr in Eq. (12) subject to the augmented state equation Eqs. (4), (9). In contrast to the original OCP Eq. (6), the objective function of the reformulated problem only depends on the solution of the augmented state equation at the end point T of the observation interval. Thus, the reformulated problem is an optimization problem and can be interpreted as a classical nonlinear least squares problem with one data point arising from an observation at the end point T .

Methods
NONMEM [17] provides all necessary tools to be repurposed for solving user-defined optimization problems. In the following, the structural form of a NONMEM data file and control stream to solve an optimization problem Eqs. (4), (9), (12) is presented. We focus on the differences compared to estimating model parameters of a nonlinear mixed effects model (NLME) illustrated alongside the already introduced motivating example of the indirect response model for an elevated biomarker, which is also the first example in the Results section. In the Appendix, a schematic overview of the Methods section and the full code of the examples discussed in the Results section are available.

The NONMEM data file
The dosing time points are defined utilizing the dose records in the data file. In the motivating example, we consider a daily administration from days 0 to 41. In the NLME context, AMT specifies the dose which actually was administered. In the optimization context, AMT will serve only as a factor multiplied with the dose, therefore, we set AMT = 1.0. In addition, in case of IV infusion, the additional column RATE = -2 is required, and the duration of the infusion is specified in the control stream. If the drug is not administered into the first compartment, this needs to be indicated with an additional column CMT.
The dose records are followed by one observation record at the final time T , which has an observed value DV = 0.0 as place holder, e.g.,

The NONMEM control stream file
In contrast to a model parameter estimation task, now the doses are estimated. Therefore, the doses D are associated with the estimation parameters THETA, i.e., D 1 = THETA (1), The F1 is a scale factor for any inputted dose, i.e., F1*AMT is the input into the dosing compartment which is the first compartment. Note that F2 is the corresponding scale factor for a dose administration into the second compartment, etc. In our optimization context, we choose AMT = 1.0 in the data file such that F1*AMT = F1 serves as the dose. Hence, the F1 scale changes according to the dosing scenario, e.g.,  (6) At each dosing record in the data file with dosing time point between 0 and 6, the dose D 1 = THETA(1) is administered, i.e., seven times, then at each dosing time point between 7 and 13 the dose D 2 = THETA(2) is administered, etc. This defines a daily administration, but the dose changes only every week. In addition, for IV infusion, the infusion duration needs to be specified, e.g., by D1 if the dosing compartment is the first.
The model equations are handed over to NONMEM using $DES as usual, e.g., given by DADT (1)  for the motivating example.
The model output Y is defined as the objective function Eq. (12) which shall be minimized, e.g.,

$ERROR Y = A(3)
Note that in contrast to the NLME context, no error model is needed because there is no randomness, and absolute optimal values for the doses are fitted without any variability or uncertainty.
By $THETA (0.0,1.0,1.0E+05) X 6 we indicate that for all six doses the lower bound is 0, the initial guess for the dose is 1.0 and the upper bound 10 5 .
Finally, we run the first order estimation to compute the optimal doses THETA via $EST -2LL NSIG = 5 The -2LL option indicates the minimization of the objective function as well as that the OFV is given by the model output Y as specified in the $ERROR block. The number of significant digits NSIG tightens the stopping criterion for the optimization and improves the accuracy of the gradient of the objective function.

Results
In this section, we present four examples of PKPD models covering a broad variety of optimal dosing tasks arising in drug development and clinical pharmacology. The data files and control streams to set up each optimization problem in NONMEM can be found in the Appendix. The tolerances for solving the differential equation ranged from TOL = 6 to TOL = 10. For the number of significant digits, we recommend at least NSIG = 5 . Both choices are problem-specific and depend on the desired accuracy of the optimal doses, the OFV and the gradient.
All solutions were verified by the original OptiDose implementation [15] in MATLAB [25] which was also utilized to create the figures. All computations were performed on an ASUSTek computer with Intel(R) Core(TM) i7-7700HQ CPU processor with 2.80GHz and 16GB RAM. In the following, the impact of the problem setup of the optimal dosing task, such as different dosing scenarios or different reference functions, on the optimal doses is discussed.

Dosing scenario A
The optimal solution with the associated response and the desired therapeutic goal is shown in Fig. 2a. The optimal doses and the resulting drug concentration are depicted in Fig. 2b.
Starting from an initial guess of 1 for all six doses, NON-MEM computes the optimal doses within 0.22 s with OFV of 3.89 and norm of the gradient 2.0 × 10 −4 .

Dosing scenario A with linear loading phase
The choice of the reference function B re f impacts the computed optimal doses. In Fig. 2c, the two-week loading phase is modeled with a linearly decaying function instead of the previous quadratic approach Eq. (2). We observe that the optimal loading doses are different, compare Fig. 2d. NONMEM computes the optimal doses within 0.17 s with OFV of 4.39, and the norm of the gradient is 6.8 × 10 −6 .

Dosing scenario A with IV infusion
For the sake of completeness, we consider the dosing scenario A with IV infusion for 1 h each day instead of the IV bolus administration, again for the quadratic reference function Eq. (2). The computation of the optimal doses requires 0.09 s and provides the OFV of 3.82 with norm of the gradient of 1.2 × 10 −5 . The optimal solution is depicted in Fig.  2e and f, closely resembling the IV bolus scenario ( Fig. 2a  and b).

Dosing scenario B
For illustrational purposes, we consider a different dosing scenario allowing more frequent dose changes, namely a daily administration where the dose is fixed for only two days instead of 7 days as in dosing scenario A. The optimal solution is computed within 1.59 s with OFV of 0.06 and norm of the gradient 1.6 × 10 −7 . The results are displayed in Fig. 3a and b in , the associated response follows the desired one very closely, however, a very large dose on days 7 and 8 is administered. To avoid this, the degree of freedom can be reduced as in dosing scenario A, where the weekly dosing schedule avoids single large peak doses, or the maximal dose, that can be administered, can be limited, such as in the following dosing scenario C.

Dosing scenario C
An additional upper bound of 10 for the doses is chosen to avoid large doses as occuring in dosing scenario B. In Fig. 3c and d, we observe that the associated response still follows the desired one closely. The optimal solution is computed within 1.71 s with OFV of 0.45 and norm of the gradient 5.4 × 10 −7 .

Impact of the problem setup on the optimal doses
The five investigated optimal dosing tasks for the motivating example provide different optimal doses. Thus, these illustrate the impact of specific choices, such as the reference function or the number of repeated dose administrations, in the problem setup of the optimal dosing task. The therapeutic goal should guarantee a smooth transition from the elevated biomarker level to the reference range. Sharp bends like the junction between the linear loading phase and the constant maintenance phase can result in large differences of subsequent doses. Large doses can be limited by the maximal tolerable dose D max in the problem setup.
In addition, we emphasize that not necessarily the dosing scenario providing the smallest possible OFV (which is dosing scenario B here) would be considered the best choice. The OFV can be always improved by increasing the degree of freedom, i.e., the number of doses to be optimized, such as when considering scenario B instead of A. Reducing the degree of freedom usually stabilizes the underlying optimization problem and provides clinically more realistic solutions with less frequent dose changes. A reasonable aim can be to achieve a small OFV while keeping the number of optimized doses low. Fig. 4 Bispecific antibody example. a shows the ternary drug-receptor complex RC AB for the optimal solution in black and the reference value 10 (dashed red). b depicts the concentration C as semilogarithmic plot for the optimal solution in blue and the administered optimal dose D * at the dosing time points (red crosses) (Color figure online) In general, the therapeutic goal and the dosing scenario should be discussed in collaboration with clinicians. The reference function needs to be thoughtfully designed to match the therapeutic goal for the patient and the structural behavior of the PKPD model dynamics.

A bispecific antibody model
In this example, the full bispecific TMDD model presented in [21] is considered. The chosen observation interval is [0, 140] days with oral administration of the same dose at days 0, 48, 96. Initially, the system is at baseline, i.e., R A (0) = k syn A /k deg A and R B (0) = k syn B /k deg B and all others are 0. The theraupeutic goal is to keep the ternary drug-receptor complex RC AB at the reference value min(k syn A /k deg A , k syn B /k deg B ), compare [21], throughout the entire interval. The parameter values are taken from a simulation study [21].
Starting from an initial guess for the dose of 800, NON-MEM computes the optimal dose D * = 663.78 within 0.27 s providing the OFV of 5.02, and the norm of the gradient is 1.5 × 10 −5 . The optimal dose and resulting ternary complex can be seen in Fig. 4.

Reaching a target AUC
A frequent task, e.g., in therapeutic drug monitoring, is to reach a certain target AUC of a drug concentration [26,27]. For the third example, the dosing scenario assumes a fixed dose of 300 that has been administered twice at time points 0 and 6 h, and the subsequent dose, which is administered four times after 12, 18, 24 and 36 h, is optimized such that the patient reaches the target AUC of 985 at final time T = 48 h. The PK model is a one-compartment with oral administration.

A delay differential equation example
Delays in PKPD models can be described by delay differential equations [23] and handled in NONMEM, see [24]. As an example, we consider a model for rheumatoid arthritis in collagen-induced arthritic mice [22], see Eqs. (63)-(67) in [23], equipped with a one-compartment PK and IV bolus dosing. The total arthritic score (TAS), given by the overall description of inflammation and the bone destruction, shall follow a linear phase up to day 15 followed by a constant phase, i.e., AK S re f (t) = 2 1 + exp(−2(t − 15)) .
Then, the additional Eq. (9) for integrating the objective function in NONMEM is given as which provides the OFV obj(T , D) as in Eq. (11). NON-MEM computes the optimal dose D * = 3.67 within 0.12 seconds with OFV of 2.03 and norm of the gradient 1.1 × 10 −6 . The optimal solution is displayed in Fig. 6. Note that this desired response was chosen for illustrational purposes based on the assumption to aim at maintaining a tolerable disease burden.

Discussion
Optimal control is an emerging discipline in pharmacometrics. Typically, such approaches consider a continuous time-dependent control function, e.g., the drug concentration, and apply Pontryagin's maximum principle to derive an update formula for the control, which is then computed using a forward-backward sweep algorithm. In contrast, the major novelty of the OptiDose concept is to directly optimize the drug doses which generate the drug concentration. The resulting finite-dimensional optimal control problem allows to efficiently compute the gradient of the objective function and to apply robust algorithms from finite-dimensional optimization, such as quasi-Newton methods. Fig. 6 Rheumatoid arthritis example. In a, the resulting drug concentration C (blue) from the optimal dose D * (red crosses) is displayed. In b, one sees the associated TAS (black) associated with the optimal dos-ing and the reference TAS (dashed red) which was aimed for. In c, the associated AKS (black) and reference AKS (dashed red) are displayed (Color figure online) In this paper, we elaborated how to set up and solve an optimal dosing task in NONMEM based on a reformulation of the optimal control problem of OptiDose. For a given dosing scenario, the optimal doses, which bring the model response as closely as possible to a desired therapeutic goal, can be computed efficiently in NONMEM utilizing standard commands. The presented modeling and optimization approach aims to improve existing trial-and-error procedures in drug dosing, and to provide additional insights into the PKPD model behavior. It enables an unbiased dose selection to reach a specific therapeutic goal, and thereby enhances optimal care for each individual patient. However, user-defined items such as the maximal tolerable doses, the number of repeated dose administrations and the reference function characterizing the therapeutic goal have an impact on the computed optimal doses. Therefore, these should be chosen thoughtfully, e.g., in collaboration with clinicians, when setting up the optimization problem. A reasonable approach seems to aim for a small objective function value while considering a clinically realistic dosing scenario with an adequate number of doses to be optimized.
An indispensable extension to the method will be to compute optimal doses for a target population not solely based on the model parameters of a typical individual, but to include inter-individual variability. In [18], the dose optimization for a target population is addressed based on a target variable and a risk (i.e., objective) function. The target variable is a scalar number, e.g., the steady state concentration, whereas the reference function of OptiDose facilitates a time-dependent therapeutic goal which is present in many PD applications. Incorporating the population approach into the OptiDose concept in NONMEM will allow to address important questions arising in drug development. Further interesting and important issues to address in future work are to perform a sensitivity analysis for the computed optimal doses to account for variability and uncertainty in the model parameters, and to include state constraints in the optimization problem.