Are datasets for NLME models large enough for a bootstrap to provide reliable parameter uncertainty distributions?
Ronald Niebecker (1), Mats O. Karlsson (1)
(1) Department of Pharmaceutical Biosciences, Uppsala University, Sweden.
Objectives: The nonparametric bootstrap is frequently used to determine parameter precision. This work explores whether typical combinations of model complexity and dataset size are compatible with appropriate behaviour of the bootstrap procedure. It further introduces a diagnostic for identifying cases in which a bootstrap does not provide appropriate parameter uncertainty distributions.
Methods: Real data examples (7–12 model parameters; datasets of 59–74 individuals with 2.6–14 observations per individual [1–3]) and simulation examples were investigated. For each scenario, three distributions of the difference in objective function value (dOFV) were generated. (i) The theoretical dOFV distribution was derived from a chi-square distribution (degrees of freedom = number of model parameters). (ii) A bootstrap was performed; each bootstrap parameter vector was evaluated on the original dataset, and the dOFVs relative to the original model fit formed the bootstrap dOFV distribution. (iii) Stochastic simulation and reestimation (SSE) was performed; the OFVs for both the simulation and the estimation parameter vectors were evaluated on each simulated dataset, and their difference formed the reference dOFV distribution. Confidence intervals (CIs) determined by bootstrap, SSE and log-likelihood profiling (LLP) were compared. The analysis was carried out in NONMEM 7.2 [4] aided by PsN [5].
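The construction of the bootstrap and reference dOFV distributions in (ii) and (iii) can be sketched as follows. This is a minimal illustration, not the NONMEM/PsN implementation: it assumes the OFVs have already been obtained from the estimation software, the function and argument names are hypothetical, and the sign convention (OFV at the evaluated parameter vector minus OFV at the estimate for that dataset) is an assumption chosen so that dOFVs are non-negative when the estimation has found the minimum.

```python
from typing import List, Sequence


def bootstrap_dofv(ofv_boot_on_original: Sequence[float],
                   ofv_original_fit: float) -> List[float]:
    """dOFV of each bootstrap parameter vector, evaluated on the ORIGINAL data.

    ofv_boot_on_original: OFV obtained by evaluating (not re-estimating)
        each bootstrap parameter vector on the original dataset.
    ofv_original_fit: OFV of the original model fit.
    """
    return [ofv - ofv_original_fit for ofv in ofv_boot_on_original]


def sse_dofv(ofv_at_sim_params: Sequence[float],
             ofv_at_est_params: Sequence[float]) -> List[float]:
    """Reference dOFV for each simulated dataset.

    ofv_at_sim_params: OFV of the simulation parameter vector on each
        simulated dataset.
    ofv_at_est_params: OFV of the re-estimated parameter vector on the
        same simulated dataset.
    """
    if len(ofv_at_sim_params) != len(ofv_at_est_params):
        raise ValueError("one pair of OFVs is required per simulated dataset")
    return [s - e for s, e in zip(ofv_at_sim_params, ofv_at_est_params)]


# Made-up OFVs, for illustration only.
boot = bootstrap_dofv([105.0, 101.5], ofv_original_fit=100.0)
ref = sse_dofv([210.0, 198.0], [205.0, 197.0])
```

Both functions return one dOFV per bootstrap sample or simulated dataset, so the resulting lists can be compared directly with quantiles of the theoretical chi-square distribution.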
Results: For the investigated real data examples, 27% to 51% of the bootstrap dOFV values exceeded the 95th percentile of the theoretical dOFV distribution. Bootstrap CIs were inflated relative to those derived from SSE. Simulation and reestimation of equal-sized datasets confirmed these findings. For increased dataset sizes, the bootstrap dOFV distribution converged to the theoretical and reference distributions (which were superimposed for all studied datasets). In parallel, bootstrap CIs agreed more closely with those obtained from SSE and LLP. Additional simulations further confirmed the dependency among the information in the dataset, the difference between the dOFV distributions and the quality of the CIs.
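The exceedance statistic reported above can be computed along the following lines. This is a hedged sketch rather than the authors' code: the function names are hypothetical, and the chi-square quantile is obtained from the Wilson-Hilferty approximation so that only the Python standard library is needed (an exact quantile function, such as `scipy.stats.chi2.ppf`, could be substituted). If the bootstrap dOFV distribution followed the theoretical one, about 5% of the values would be expected to exceed the 95th percentile.

```python
from statistics import NormalDist
from typing import Sequence


def chi2_quantile_approx(p: float, df: int) -> float:
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)   # standard normal quantile
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


def fraction_exceeding(dofvs: Sequence[float], df: int, p: float = 0.95) -> float:
    """Fraction of dOFVs above the p-quantile of chi-square(df).

    df is the number of estimated model parameters.
    """
    if not dofvs:
        raise ValueError("at least one dOFV value is required")
    cutoff = chi2_quantile_approx(p, df)
    return sum(1 for d in dofvs if d > cutoff) / len(dofvs)


# Made-up dOFVs for a 7-parameter model, for illustration only.
example_fraction = fraction_exceeding([1.0, 5.0, 20.0, 30.0], df=7)
```

A fraction well above `1 - p` indicates that the bootstrap parameter vectors fit the original data worse than the theoretical distribution predicts.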
Conclusions: This analysis showed that the bootstrap may already be unsuitable for nonlinear mixed-effects (NLME) analyses in which the dataset would commonly be considered “large enough”. Determining the bootstrap dOFV distribution is recommended as a diagnostic of inflated CIs.
Acknowledgement: This work was supported by the DDMoRe (www.ddmore.eu) project.
References:
[1] Karlsson et al., J Pharmacokinet Biopharm. 1998;26(2):207–46.
[2] Wählby et al., Br J Clin Pharmacol. 2004;58(4):367–77.
[3] Grasela et al., Dev Pharmacol Ther. 1985;8(6):374–83.
[4] Beal et al., NONMEM user’s guides. Icon Development Solutions, Ellicott City, MD, USA; 1989–2009.
[5] Lindbom et al., Comput Methods Programs Biomed. 2005;79(3):241–57.