Stochastic Gate Neural Networks for Automatic Covariate Selection in Pharmacometrics Population Modeling
Marija Kekic(1), Oleg Stepanov(2), Andrzej Nowojewski(3), Sam Richardson(3), Itziar Irurzun Arana(2), Jacob Leander(4), Diansong Zhou(5), Weifeng Tang(6), Richard Dearden(3), Megan Gibbs(6)
(1)Imaging & Data Analytics, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Barcelona, Spain; (2)Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Cambridge, UK; (3)Imaging & Data Analytics, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Cambridge, UK; (4)Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Gothenburg, Sweden; (5)Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Waltham, MA, USA; (6)Clinical Pharmacology & Quantitative Pharmacology, Clinical Pharmacology & Safety Sciences, R&D BioPharmaceuticals, AstraZeneca, Gaithersburg, MD, USA
Introduction:
Population pharmacokinetic (PK) models describe the behavior of drugs in the body and are usually constructed within a nonlinear mixed-effects framework. The modeling process typically unfolds in two steps. First, a base model is developed by choosing the type of absorption and clearance, the number of disposition compartments, and the specification of the interindividual and residual random effects. Second, patient characteristics (covariates) that might explain part of the variability in PK are sought.
Covariates are chosen based on clinical and statistical relevance. The latter is typically determined by a stepwise covariate modeling (SCM) approach [1]. A preliminary selection based on visual exploratory plots and/or univariate analysis is often conducted, especially when dealing with numerous potential covariates.
Recently, there have been attempts to employ fast machine learning (ML) methods to pre-select relevant covariates by searching for patterns in estimated individual PK parameters [2, 3]. This covariate selection is usually done in two steps: first, an ML algorithm is trained to predict individual PK parameters from patient covariates; then feature importance methods are used to identify the most influential features. The objective of this study is to explore neural networks (NN) with a Stochastic Gates layer [4], which combine training and feature selection in a single step.
Objectives:
- Perform an ablation study of the Stochastic Gates NN on synthetic data while varying the covariate structure of the model.
- Validate the algorithm on a real dataset by comparing the selected covariates with the covariate model from the tixagevimab/cilgavimab interim PK analysis [5].
Methods:
As in [2, 3], we fit a base covariate-free model in NONMEM to obtain estimates of the individual random effects (ETAs). Instead of using empirical Bayes estimates for each individual ETA, we use samples from the conditional distributions, which have been shown to be more reliable in cases of high shrinkage [6].
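For orientation, the quantities involved can be written as follows. This is only a sketch: the log-normal parameterization and the symbols ($P_i$, $\theta$, $\eta_i$, $\omega^2$, $x_i$, $f$) are assumptions for illustration, since the abstract does not state the exact form of the base model. Here $P_i$ is a PK parameter of subject $i$, $\eta_i$ is the corresponding ETA, and $f$ is the function of the covariates $x_i$ that the neural network is trained to approximate.

```latex
P_i = \theta \, \exp(\eta_i), \qquad
\eta_i \sim \mathcal{N}(0, \omega^2), \qquad
\eta_i \approx f(x_i)
```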
A simple 3-layer NN with a Stochastic Gates layer, implemented in PyTorch, is used to predict individual ETA values from patient covariates. The layer introduces an additional hyperparameter (lambda) that controls the penalty on the number of input covariates and must be tuned separately.
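The gate mechanism can be sketched in plain Python, loosely following the Gaussian relaxation described in the Stochastic Gates paper [4]. This is a minimal illustration and not the PyTorch layer used in the study: the function names, the noise scale `sigma`, and all numeric values are assumptions chosen for the example.

```python
import math
import random


def sample_gate(mu, sigma, rng, training=True):
    """Relaxed Bernoulli gate: clip (mu + Gaussian noise) to [0, 1]."""
    eps = rng.gauss(0.0, sigma) if training else 0.0
    return min(1.0, max(0.0, mu + eps))


def gate_open_probability(mu, sigma):
    """P(mu + eps > 0) = Phi(mu / sigma), with Phi the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))


def gated_inputs(covariates, mus, sigma, rng, training=True):
    """Multiply each covariate by its own gate before the first dense layer."""
    return [x * sample_gate(mu, sigma, rng, training)
            for x, mu in zip(covariates, mus)]


def gate_penalty(mus, sigma, lam):
    """Regularizer: lambda times the expected number of open gates."""
    return lam * sum(gate_open_probability(mu, sigma) for mu in mus)


rng = random.Random(0)
mus = [0.9, 0.5, -0.4]   # hypothetical gate parameters for 3 covariates
x = [1.2, -0.3, 0.8]     # hypothetical standardized covariate values
print(gated_inputs(x, mus, sigma=0.5, rng=rng))
print(gate_penalty(mus, sigma=0.5, lam=0.1))
```

In a full model the penalty would be added to the prediction loss, so that a larger lambda pushes more gates towards zero.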
The final gate probabilities, which represent how strongly a covariate affects a given PK parameter, are obtained as the mean over a 5-fold cross-validation.
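The averaging step might look like the sketch below. The interleaved fold assignment, the helper names, and the per-fold probabilities are hypothetical placeholders, not values or code from the study.

```python
from statistics import mean


def k_fold_indices(n_subjects, k=5):
    """Split subject indices 0..n-1 into k interleaved held-out folds."""
    return [list(range(fold, n_subjects, k)) for fold in range(k)]


def mean_gate_probabilities(per_fold_probs):
    """Average each covariate's gate probability over all folds."""
    n_covariates = len(per_fold_probs[0])
    return [mean(fold[j] for fold in per_fold_probs)
            for j in range(n_covariates)]


# Hypothetical gate probabilities: one row per fold, one column per covariate.
per_fold = [
    [0.98, 0.10, 0.55],
    [0.95, 0.05, 0.60],
    [0.99, 0.12, 0.40],
    [0.97, 0.08, 0.65],
    [0.96, 0.15, 0.50],
]
print(k_fold_indices(12, k=5))
print(mean_gate_probabilities(per_fold))
```

Averaging over folds dampens the effect of any single train/validation split on which gates stay open.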
We tested the method on synthetic data, following the approach of Sibieude et al. [2], where different scenarios of covariate numbers, correlations, and effect sizes are simulated.
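A toy version of such a scenario is sketched below. It is not the simulation design of [2]: the two-covariate setup, the linear effect on ETA, and the parameter names `rho`, `effect`, and `omega` are assumptions made only to show how correlation and effect size can be varied.

```python
import math
import random


def simulate_subjects(n, rho, effect, omega, seed=0):
    """Simulate (x1, x2, eta): x1 drives eta, x2 is only correlated with x1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        # x2 has unit variance and correlation rho with x1.
        x2 = rho * x1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # eta depends on x1 only; omega is the unexplained standard deviation.
        eta = effect * x1 + rng.gauss(0.0, omega)
        rows.append((x1, x2, eta))
    return rows


def correlation(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rows = simulate_subjects(n=2000, rho=0.6, effect=0.3, omega=0.2)
x1, x2, eta = (list(column) for column in zip(*rows))
print(correlation(x1, x2), correlation(x2, eta))
```

In this setup `x2` correlates with ETA without having a direct effect on it, which is the kind of false positive a selection method has to avoid.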
Finally, we applied the algorithm to the tixagevimab/cilgavimab interim data [5] and compared the selected covariates with those chosen by an expert pharmacometrician.
Results:
Synthetic data were used both to select the lambda parameter and to identify the limits of this approach. We found that a lambda value of 15-25% of the total variance of the ETA values retains all relevant covariates while minimizing falsely selected ones.
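Read literally, this rule of thumb gives a lambda interval that can be computed directly from the ETA samples. The sketch below assumes that "15-25% of the total variance" means multiplying the sample variance by 0.15 and 0.25; the function name and the ETA values are hypothetical.

```python
from statistics import pvariance


def lambda_range(eta_samples, low=0.15, high=0.25):
    """Return (low * Var(eta), high * Var(eta)) as a candidate lambda interval."""
    variance = pvariance(eta_samples)
    return low * variance, high * variance


# Hypothetical ETA samples for one PK parameter.
eta_samples = [-0.42, 0.10, 0.35, -0.05, 0.27, -0.31, 0.08, 0.19]
lam_low, lam_high = lambda_range(eta_samples)
print(f"lambda between {lam_low:.4f} and {lam_high:.4f}")
```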
Furthermore, we found that neither the number of covariates nor their correlation was detrimental to the algorithm, whereas the covariate effect size, a high overall variance of the PK parameters, and imbalances in covariate distributions do affect its performance.
The approach was subsequently applied to the interim data of tixagevimab/cilgavimab, where we tested 14 categorical and 11 continuous covariates. The algorithm identified 10 relevant covariate-PK parameter pairs out of a total of 100 possible pairs, including all 5 pairs previously chosen by a pharmacometrician. The selection of additional covariates indicates that a domain expert is still needed to refine the covariate set.
Conclusion:
We have successfully applied a neural network with an embedded feature selection layer to identify relevant covariates in both simulated and real data sets. The approach allows fast selection of potentially relevant covariates and/or identification of residual dependencies on covariates already included in the model.
A further advantage of the embedded feature selection layer is that it can be natively integrated into the rapidly emerging neural ODE-enhanced PK models [7, 8], which we plan to investigate in the future.
References:
[1] Lindbom L, Ribbing J, Jonsson EN. Perl-speaks-NONMEM (PsN) – a Perl module for NONMEM related programming. Comput Methods Programs Biomed. 2004;75:85-94.
[2] Sibieude E, Khandelwal A, Hesthaven JS, Girard P, Terranova N. Fast screening of covariates in population models empowered by machine learning. J Pharmacokinet Pharmacodyn. 2021;48(4):597-609. doi:10.1007/s10928-021-09757-w
[3] Ogami C, Tsuji Y, Seki H, Kawano H, To H, Matsumoto Y, Hosono H. An artificial neural network-pharmacokinetic model and its interpretation using Shapley additive explanations. CPT Pharmacometrics Syst Pharmacol. 2021;10:760-768. doi:10.1002/psp4.12643
[4] Yamada Y, Lindenbaum O, Negahban S, Kluger Y. Feature Selection using Stochastic Gates. Proceedings of Machine Learning and Systems 2020. https://github.com/runopti/stg
[5] European Medicines Agency. Public Assessment Report for Evusheld. 2022. https://www.ema.europa.eu/en/documents/assessment-report/evusheld-epar-public-assessment-report_en.pdf
[6] Lavielle M, Ribba B. Enhanced Method for Diagnosing Pharmacometric Models: Random Sampling from Conditional Distributions. Pharm Res. 2016;33(12):2979-2988. doi:10.1007/s11095-016-2020-3. PMID: 27604892.
[7] Bräm DS, Nahum U, Schropp J, et al. Low-dimensional neural ODEs and their application in pharmacokinetics. J Pharmacokinet Pharmacodyn. 2023. doi:10.1007/s10928-023-09886-4
[8] Janssen A, Leebeek FW, Cnossen MH, Mathôt RA, for the OPTI-CLOT study group and SYMPHONY consortium. Deep compartment models: A deep learning approach for the reliable prediction of time-series data in pharmacokinetic modeling. CPT Pharmacometrics Syst Pharmacol. 2022;11:934-945. doi:10.1002/psp4.12808
Financial disclosure statement for all authors: All authors were employees of AstraZeneca at the time of study and may own stocks or stock options.