Performance of Machine Learning Algorithms for Model Selection
Xinnong Li (1), Mark Sale (2), Keith Nieforth (2), James Craig (2), Fenggong Wang (3), David Solit (4), Kairui Feng (3), Meng Hu (3), Robert Bies (1), Liang Zhao (3)
(1) University at Buffalo, (2) Certara USA, (3) Office of Research and Standards, Office of Generic Drugs, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, (4) Memorial Sloan Kettering
Introduction/Objectives: pyDarwin [1] is an open-source Python package for population pharmacokinetic (PK) model selection. The objective was to evaluate the performance (robustness and efficiency) of five machine learning algorithms for population PK model selection, compared with exhaustive search.
Methods:
Previously reported data [2] from a study of dimethylaminoethylamino-17-demethoxygeldanamycin (DMAG) were used. A total of 951 observed concentrations were available from 66 subjects. Multiple daily doses were administered intravenously, with samples collected for up to 102 hours. Ages ranged from 28 to 82 years and weights from 48 to 137 kg; 39% of subjects (26 of 66) were male and 61% (40 of 66) were female. Renal function was normal.
To perform a search for the optimal model to describe these data, a “search space” was defined, which included:
- Number of compartments (1|2|3)
- Central and peripheral (V2 and V3) volumes as power functions of weight
- Central and intercompartmental (Q2 and Q3) clearances as power functions of weight
- Central volume as a function of age and sex
- Central clearance as a function of age, sex, and serum creatinine
- Between-occasion variability on clearance, Q2, central volume, and V2
- Initial estimate for clearance
- Residual error (additive vs combined additive and proportional)
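As a rough illustration of how such a search space grows, the sketch below treats each feature as a list of options and counts the candidate models as the product of the option counts. The feature names and option lists are hypothetical placeholders, not the study's actual search space. The per-feature option counts are not stated here; the reported total of 1,572,864 candidate models equals 3 × 2^19, which would be consistent with (but does not establish) one three-way choice combined with nineteen two-way choices.

```python
from itertools import product
from math import prod

# Hypothetical option lists per feature; NOT the study's actual search space.
toy_space = {
    "n_compartments": [1, 2, 3],
    "weight_on_volume": [False, True],
    "weight_on_clearance": [False, True],
    "residual_error": ["additive", "combined"],
}


def space_size(space):
    """Number of candidate models: the product of the option counts."""
    return prod(len(options) for options in space.values())


def enumerate_models(space):
    """Yield every candidate model as a dict mapping feature name to option."""
    names = list(space)
    for values in product(*space.values()):
        yield dict(zip(names, values))


print(space_size(toy_space))
```

Adding one more two-way feature doubles the count, which is why exhaustive enumeration becomes expensive quickly.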
Each algorithm was used to search for the optimal combination of these model features. Machine learning algorithms for this search included:
- Genetic Algorithm (GA)
- Random Forest (RF)
- Random Tree with gradient boosting (GBRT)
- Gaussian Process (GP)
- Particle Swarm Optimization (PSO)
All algorithms were supplemented with “one bit” and “two bit” local downhill search, which consists of:
- Systematically changing each feature (one bit), or every combination of two features (two bit), of the current best models
- Running the resulting models
- Repeating the above steps until no further improvement is seen
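The downhill steps above can be sketched as follows. This is a minimal illustration, not pyDarwin's implementation: a model is assumed to be a tuple of option indices, lower fitness is assumed to be better, and `neighbors`, `downhill_search`, and `toy_fitness` are hypothetical names. The toy fitness function is constructed so that two features only help when changed together, which is the situation in which a one bit search can stall.

```python
from itertools import combinations, product


def neighbors(model, n_options, n_changes):
    """Yield every model that differs from `model` in exactly `n_changes` features."""
    for positions in combinations(range(len(model)), n_changes):
        # For each chosen feature, list every option other than the current one.
        alternatives = [
            [v for v in range(n_options[p]) if v != model[p]] for p in positions
        ]
        for values in product(*alternatives):
            candidate = list(model)
            for p, v in zip(positions, values):
                candidate[p] = v
            yield tuple(candidate)


def downhill_search(start, n_options, fitness, max_changes=2):
    """Move to the best neighbor until no neighbor improves (lower is better)."""
    best = tuple(start)
    best_fit = fitness(best)
    while True:
        scored = [
            (fitness(c), c)
            for k in range(1, max_changes + 1)
            for c in neighbors(best, n_options, k)
        ]
        if not scored:
            return best, best_fit
        cand_fit, cand = min(scored)
        if cand_fit >= best_fit:
            return best, best_fit
        best, best_fit = cand, cand_fit


# Hypothetical fitness: features 1 and 2 are penalized unless they agree.
TARGET = (2, 1, 1)


def toy_fitness(model):
    mismatches = sum(a != b for a, b in zip(model, TARGET))
    penalty = 25 if model[1] != model[2] else 0
    return 100 + 10 * mismatches + penalty


one_bit = downhill_search((0, 0, 0), [3, 2, 2], toy_fitness, max_changes=1)
two_bit = downhill_search((0, 0, 0), [3, 2, 2], toy_fitness, max_changes=2)
```

In this toy case the one bit search stops at a local minimum, while allowing two simultaneous changes reaches the target model.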
The reference for the robustness assessment (did the algorithm find the optimal model?) was exhaustive search (EX), i.e., running all 1,572,864 candidate models. The efficiency criteria were the number of models evaluated before the first appearance of the optimal model and the total search time.
Execution was on a 40-core computer running Windows Server, Python version 3.11, and NONMEM version 7.4.
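One way to compute both criteria from a search log is sketched below. It assumes the fitness of each evaluated model is recorded in evaluation order and that the optimal fitness is known from the exhaustive search; the function name, tolerance, and example history are hypothetical.

```python
from typing import Optional, Sequence


def models_to_optimum(
    fitness_history: Sequence[float],
    optimal_fitness: float,
    tol: float = 1e-3,
) -> Optional[int]:
    """Return how many models were evaluated up to and including the first
    one whose fitness matches the optimum, or None if it never appears."""
    for count, value in enumerate(fitness_history, start=1):
        if abs(value - optimal_fitness) <= tol:
            return count
    return None


# Hypothetical evaluation history, for illustration only.
history = [8350.2, 8220.745, 8275.0, 8201.271, 8201.271]
found_at = models_to_optimum(history, optimal_fitness=8201.271)
```

A return value of `None` corresponds to a robustness failure (the optimum was never found), while a count gives the efficiency measure.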
Results:
All algorithms selected the optimal model when used with both the 1 bit and 2 bit downhill search. PSO with only a 1 bit downhill search failed to find the optimal model. GP was the most efficient based on the number of models run (495) before the first appearance of the optimal model, but GA was the most efficient based on time to completion of all generations (20 generations, 322 minutes). Results are given below:
EX: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = n/a; Best Fitness, 1 bit search = 8201.271
GA: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = 1307; Best Fitness, 1 bit search = 8201.271
RF: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = 880; Best Fitness, 1 bit search = 8201.271
GP: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = 495; Best Fitness, 1 bit search = 8201.271
PSO: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = 1710; Best Fitness, 1 bit search = 8220.745
GBRT: Best Fitness, 1 and 2 bit search = 8201.271; N Models to best = 1328; Best Fitness, 1 bit search = 8201.271
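To put the "N Models to best" values in context, the short sketch below expresses each as a percentage of the 1,572,864 candidate models run by the exhaustive search. The counts are taken from the results above; the variable and function names are illustrative.

```python
SEARCH_SPACE_SIZE = 1_572_864  # candidate models in the exhaustive search

# "N Models to best" for each algorithm, from the results above.
models_to_best = {"GA": 1307, "RF": 880, "GP": 495, "PSO": 1710, "GBRT": 1328}


def percent_of_space(n_models, space_size=SEARCH_SPACE_SIZE):
    """Percentage of the search space evaluated before the optimum appeared."""
    return 100.0 * n_models / space_size


# Rank algorithms from fewest to most models evaluated.
ranking = sorted(models_to_best, key=models_to_best.get)
for name in ranking:
    n = models_to_best[name]
    print(f"{name}: {n} models ({percent_of_space(n):.3f}% of the search space)")
```

Every algorithm reached the optimum after evaluating well under 1% of the candidate models.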
The final model included:
- 3 compartments
- Power function of centered weight for volume
- Between-subject variability on central volume, clearance, peripheral volume 1, and intercompartmental clearance 2
- Between-occasion variability on central volume, clearance, and intercompartmental clearance 1
- Combined additive and proportional residual error model
Conclusions: All algorithms, when combined with 1 and 2 bit local downhill search, identified the optimal model in a search space of 1,572,864 candidate models. PSO failed to identify the optimal model when used with only 1 bit local downhill search. GP was the most efficient based on the number of models examined, but GA was the most efficient based on total time to completion of the search (20 generations).
This work was supported by an FDA/NIH grant (Federal Award Identification Number U01FD007355), “Development of a model selection method for population pharmacokinetics analysis by deep-learning based reinforcement learning” (RFA-FD-21-027).
Disclaimer: This presentation reflects the views of the authors and should not be construed to represent FDA’s views or policies.
References:
[1] https://certara.github.io/pyDarwin/html/index.html.
[2] Banerji U, et al. Phase I pharmacokinetic and pharmacodynamic study of 17-allylamino, 17-demethoxygeldanamycin in patients with advanced malignancies. J Clin Oncol. 2005;23:4152–4161. doi: 10.1200/JCO.2005.00.612