Welcome to the Population Approach Group in Europe

Parameter estimation in nonlinear fixed-effects QSP models: benchmark of optimization methods

Sergei Vavilov1, Victor Sokolov1, Kirill Zhudenkov1, Leonid Stolbov1, Kirill Peskov1,2

1 M&S Decisions LLC, Moscow, Russia, 2 Sechenov University, Moscow, Russia

Objectives: Robust and efficient parameter estimation is one of the key steps in quantitative systems pharmacology (QSP) model development. The problem of parameter estimation is solved by maximizing an objective function that describes the likelihood of predicting the observed data, given a suggested set of parameters [1]. Therefore, parameter estimation in QSP models requires an informed choice of a suitable numerical optimization algorithm. This algorithm should be able to provide point estimates and respective confidence intervals based on aggregated experimental data and a likelihood function, given a calibrated ODE-based model. The objective of this report is to compare selected software solutions that provide parameter estimates in nonlinear fixed-effects (NLFE) models.

Methods: A previously published semi-mechanistic QSP model [2] that describes the effect of exenatide, a GLP-1 receptor agonist, on food retention was used to assess the performance of selected optimization methods. The model incorporates 11 ODEs, 2 estimated parameters, and a single observed variable. A set of R-based packages and standalone software were tested to provide parameter estimates for this model.

Most tested optimization methods are embedded in software packages that also include ODE solvers: Monolix 2020R1 utilizes Nelder-Mead optimization for NLFE models [3]; R-based dMod 1.0.2 implements a trust-region optimization algorithm [4,5]; R-based nlmixr 2.0.4.1 supports a wide selection of methods, including Nelder-Mead, several L-BFGS variants, BOBYQA, MMA, and others [6]. In all cases, the performance of the algorithms was assessed with the microbenchmark 1.4-7 package [7].

Results: A review of selected methods provided specific features that are important for each class of the optimization algorithms. Derivative-free optimization methods are suitable when calculation of the gradient and the Hessian of the likelihood function is numerically costly or unreliable, such as in the case of ODEs with discontinuous explicit functions or stiff ODEs. These methods include the Nelder-Mead simplex method [8] and a class of global stochastic methods: simulated annealing implemented in optim [9] and GenSA [10], differential evolution optimization implemented in DEoptim [11]; the latter can use iteration count or other advanced stopping criteria [12].

In contrast, derivative-based methods such as L-BFGS, trust-region and quadratic approximation methods use projected gradient norm or trust region radius as their convergence and termination criteria. Since these methods track an incrementally improving approximation to the Hessian of the likelihood function, they can provide confidence intervals of the resulting parameter estimates without a post-hoc Hessian calculation.

QSP models often include physiologically based constraints on parameter values. Some optimization algorithms (i. e. L-BFGS-B or BOBYQA) can account for these constraints, while others are unconstrained by default and require parameter transformations.

Among selected algorithms, Nelder-Mead, variants of L-BFGS, BOBYQA and MMA all converged to the same optimum given respective tolerances, as well as provided similar confidence intervals, and showed the same performance (median running time range from 700 to 2300 ms, function evaluation counts range from 200 to 500). As expected, global optimization algorithms only managed to converge to the same optimum after a larger number of objective function evaluations (median running time was 178 sec, function evaluation counts range from 8000 to 30 000). However, challenging high-dimensional nonlinear systems may contain multiple local optima in which deterministic optimization algorithms may get stuck [13].

Conclusions: Local derivative-based methods such as Nelder-Mead or L-BFGS-B serve as a good start for estimating parameters in NLFE problems. Depending on the constraints in the parameter space, smoothness of the likelihood function, or the number of local minima, a derivative-free local algorithm (BOBYQA or MMA) or a derivative-free global algorithm (generalized simulated annealing or differential evolution) can perform equally well or better. Further benchmarking of optimization algorithms could involve more advanced QSP models with a larger parameter space and multiple observed variables.

References:
[1] Helmlinger, G., Sokolov, V., Peskov, K., Hallow, K. M., Kosinsky, Y., Voronova, V., Chu, L., Yakovleva, T., Azarov, I., Kaschek, D., Dolgun, A., Schmidt, H., Boulton, D. W., & Penland, R. C. (2019). Quantitative Systems Pharmacology: An Exemplar Model-Building Workflow With Applications in Cardiovascular, Metabolic, and Oncology Drug Development. CPT: pharmacometrics & systems pharmacology, 8(6), 380–395. https://doi.org/10.1002/psp4.12426
[2] Sokolov V. et al., Evaluation of the utility of a MATLAB and R-based workflow for the development of quantitative systems pharmacology (QSP) models. ACoP9, October 2018.
[3] https://monolix.lixoft.com/tasks/population-parameter-estimation-using-saem/
[4] R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
[5] Kaschek D, Mader W, Fehling-Kaschek M, Rosenblatt M, Timmer J (2019). “Dynamic Modeling, Parameter Estimation, and Uncertainty Analysis in R.” Journal of Statistical Software, 88(10), 1–32. doi: 10.18637/jss.v088.i10.
[6] Fidler M, Xiong Y, Schoemaker R, Wilkins J, Trame M, Hooijmaijers R, Post T, Wang W (2021). nlmixr: Nonlinear Mixed Effects Models in Population Pharmacokinetics and Pharmacodynamics. R package version 2.0.4.1, https://CRAN.R-project.org/package=nlmixr
[7] Olaf Mersmann (2019). microbenchmark: Accurate Timing Functions. R package version 1.4-7. https://CRAN.R-project.org/package=microbenchmark
[8] J. A. Nelder, R. Mead. A Simplex Method for Function Minimization. The Computer Journal, Volume 7, Issue 4, January 1965, Pages 308–313, https://doi.org/10.1093/comjnl/7.4.308
[9] Claude J. P. Bélisle. "Convergence Theorems for a Class of Simulated Annealing Algorithms on Rd." Journal of Applied Probability 29, no. 4 (1992): 885-95. Accessed May 11, 2021. doi:10.2307/3214721.
[10] Y. Xiang, S. Gubian. B. Suomela, J. Hoeng (2013). Generalized Simulated Annealing for Efficient Global Optimization: the GenSA Package for R. The R Journal, Volume 5/1, June 2013. https://journal.r-project.org/archive/2013/RJ-2013-002/index.html
[11] Mullen K, Ardia D, Gil D, Windover D, Cline J (2011). “DEoptim: An R Package for Global Optimization by Differential Evolution.” Journal of Statistical Software, 40(6), 1–26. doi: 10.18637/jss.v040.i06
[12] C.G.E. Boender and H.E. Romeijn, Stochastic methods, pp. 829-869 in: R. Horst and P.M. Pardalos (eds.), Handbook of Global Optimization, Kluwer, Dordrecht 1995.
[13] https://machinelearningmastery.com/stochastic-optimization-for-machine-learning/

PAGE 2021: Methodology - Estimation Methods
Sergei Vavilov

Parameter estimation in nonlinear fixed-effects QSP models: benchmark of optimization methods

Reference: PAGE 29 (2021) Abstr 9755 [www.page-meeting.org/?abstract=9755]

Poster: Methodology - Estimation Methods

PDF poster/presentation (click to open)