2023 - A Coruña - Spain

PAGE 2023: Methodology - Covariate/Variability Models
Yuchen Guo

Generation of realistic virtual adult populations using the NHANES database: a copula approach

Yuchen Guo (1), Laura B. Zwep (1), Tingjie Guo (1), J. G. Coen van Hasselt (1)

(1) Division of Systems Pharmacology and Pharmacy, Leiden Academic Center for Drug Research, Leiden University, Leiden, The Netherlands

Introduction: Virtual patient populations, which consist of sets of patient-associated covariates, are relevant for pharmacometric simulations of clinical trials and for the optimization of dosing strategies. A key hallmark of a realistic virtual population is that the simulated covariates reflect not only the marginal distributions in the target population of interest, but also the dependency structure between those covariates.

Recently, our group has proposed the use of copula models as a relevant strategy to support simulation of virtual patient populations [1]. In this work we demonstrated favorable performance of copula models in comparison to alternative simulation approaches. Moreover, since copula models are distribution-based, they facilitate sharing of patient-specific covariate data within the community.

Objectives: In this study, we aimed to (1) develop a copula-based model to facilitate simulation of adult populations, based on the real-world population data available in the NHANES database [2]; and (2) demonstrate a typical workflow for the development of copula-based virtual patient simulation models.

Methods: In this study, we utilized public data from the NHANES database (2009–2018), a US population-based, cross-sectional study, focusing on adults aged 18-80 years. We selected specific covariates of interest for population pharmacokinetic models, including gender, race, height, body weight, fat, serum creatinine, alanine aminotransferase, aspartate aminotransferase, alkaline phosphatase, albumin and total bilirubin.

The copula was estimated in R using the rvinecopulib package [3]. Data were first transformed with the probability integral transform, based on kernel density estimation, to create variables with uniform distributions. Multiple candidate vine copulas, each using a different collection of bivariate copula families, were estimated and compared. Models were assessed based on AIC, BIC, and log-likelihood.
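The transformation of a covariate to the uniform scale described above can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not reproduce the rvinecopulib workflow used in the study. The Gaussian kernel, the rule-of-thumb bandwidth, the function name `kde_cdf`, and the log-normal example data are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, stdev


def kde_cdf(sample, bandwidth=None):
    """Return a smooth CDF estimate derived from a Gaussian kernel density."""
    n = len(sample)
    if bandwidth is None:
        # Rule-of-thumb bandwidth: an assumption, not the study's choice.
        bandwidth = 1.06 * stdev(sample) * n ** (-1 / 5)
    std_normal = NormalDist()

    def cdf(x):
        # Integrating a Gaussian kernel density gives an average of normal CDFs.
        return sum(std_normal.cdf((x - xi) / bandwidth) for xi in sample) / n

    return cdf


random.seed(1)
# Hypothetical right-skewed covariate values, for illustration only.
values = [random.lognormvariate(0.0, 0.4) for _ in range(500)]

cdf = kde_cdf(values)
# Probability integral transform: each value is mapped to its estimated CDF.
u = [cdf(v) for v in values]
```

The resulting values in `u` lie strictly between 0 and 1 and are approximately uniform, which is the scale on which a copula describes the dependence between covariates.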

The model was evaluated using a simulation-based strategy with 100 simulations of the NHANES dataset. To assess how well the model captured the marginal distributions, we compared observed and simulated means, standard deviations, and percentiles. To assess how well it captured the dependency structure, we compared observed and simulated correlation coefficients. In addition, we developed a novel two-dimensional metric to quantify the overlap of the 95th percentile contours of the observed and simulated data.
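A minimal sketch of this kind of comparison between observed and simulated summaries follows, in plain Python. The abstract does not state the exact error definitions or the type of correlation coefficient, so the relative error formula, the Pearson coefficient, the helper names, and the synthetic height and weight data are assumptions. Only one simulated replicate is compared here, whereas the study used 100. The contour overlap metric is not specified in the abstract and is not attempted.

```python
import math
import random
from statistics import mean, median, stdev


def relative_error(simulated, observed):
    """Signed relative error of a simulated summary statistic."""
    return (simulated - observed) / observed


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare(obs_x, obs_y, sim_x, sim_y):
    """Marginal metrics for x, plus the difference in x-y correlation."""
    return {
        "mean_rel_err": relative_error(mean(sim_x), mean(obs_x)),
        "sd_rel_err": relative_error(stdev(sim_x), stdev(obs_x)),
        "median_rel_err": relative_error(median(sim_x), median(obs_x)),
        "corr_err": pearson(sim_x, sim_y) - pearson(obs_x, obs_y),
    }


def hypothetical_population(seed, n=2000):
    """Correlated height (cm) and weight (kg) pairs; synthetic, not NHANES."""
    rng = random.Random(seed)
    height = [rng.gauss(170.0, 10.0) for _ in range(n)]
    weight = [70.0 + 0.8 * (h - 170.0) + rng.gauss(0.0, 10.0) for h in height]
    return height, weight


obs_h, obs_w = hypothetical_population(seed=1)  # stands in for observed data
sim_h, sim_w = hypothetical_population(seed=2)  # stands in for one simulation
metrics = compare(obs_h, obs_w, sim_h, sim_w)
```

In a full evaluation, these metrics would be computed for every covariate and covariate pair in each simulated dataset and then summarized across the replicates.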

A web application that interactively outputs the realistic virtual population was developed using the R shiny package.

Results: The adult dataset used for copula model development included 28,059 subjects. Among the candidate collections of bivariate copulas, the parametric family set gave the highest log-likelihood and the lowest AIC and BIC, followed by the itau and BB family sets. The mean, standard deviation and percentiles of virtual populations simulated from the candidate models agreed with those of the observed population, with relative errors within ±20%. Most median correlation errors were within ±0.1 of the observed correlation. For the 95th percentile contour overlap diagnostic, the vine copula models achieved over 90% overlap for the majority of covariate pairs. The candidate models performed similarly on the quantitative metrics, and the vine copula with parametric pair-copulas was chosen based on the selection criteria. Density plots showed the correspondence between the observed population and the virtual population.

Conclusions: We systematically developed and evaluated a copula model for simulation of commonly used covariates in adult individuals, which can be used as part of clinical trial design or dose optimization strategies. Moreover, we described a general workflow for the development of copula models to support virtual patient simulation.



References:
[1] Zwep, Laura B., et al. PAGE 30 (2022) Abstr 10099 [www.page-meeting.org/?abstract=10099]
[2] Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention [https://wwwn.cdc.gov/nchs/nhanes/Default.aspx]
[3] Thomas Nagler and Thibault Vatter (2022). rvinecopulib: High Performance Algorithms for Vine Copula Modeling. R package version 0.6.2.1.3. [https://CRAN.R-project.org/package=rvinecopulib]


Reference: PAGE 31 (2023) Abstr 10418 [www.page-meeting.org/?abstract=10418]
Poster: Methodology - Covariate/Variability Models