Power assessment for hierarchical combination endpoints using joint modelling of repeated time-to-event and time-to-event models versus Finkelstein-Schoenfeld method
Camille Vong (1), Steve Riley (2) and Lutz O. Harnisch (3)
Pfizer Inc, (1) Clinical Pharmacology, Pharmacometrics, Cambridge, MA, USA; (2) Clinical Pharmacology, Groton, CT, USA; (3) Clinical Pharmacology, Pharmacometrics, Sandwich, UK
Objectives: Cardiovascular (CV) clinical trials often assess therapeutic benefit based on a survival event such as death. Additional reportable clinical outcomes preceding death, such as nonfatal myocardial infarction, hospitalization for heart failure, or an increase in blood pressure, could be considered in combination with survival while still preserving the hierarchy of their clinical importance in hypothesis testing. This is especially relevant in rare diseases, where the number of patients available to investigate a treatment effect is limited. Finkelstein and Schoenfeld (FS) [1] proposed a non-parametric test based on a score derived from 1) a subject-to-subject comparison, within the same stratum, of the time to an event, and 2) if both subjects were censored, a comparison of the longitudinal measure of an ancillary endpoint. Because FS is a pairwise comparison method, differentiating doses to establish a dose-/exposure-response relationship requires multiple subgroup tests, for which trials may be insufficiently powered. Additionally, FS ignores the ancillary endpoint in patients who die during the trial, which reduces the information about a possible correlation between the two endpoints. The objective of this work is to compare the power to detect a drug effect of joint PK/PD models, which combine a time-to-event model of mortality with a repeated time-to-event model of the frequency of CV-related hospitalization (RTTE+TTE), against that of competing models, for the purpose of informing a dose recommendation for a rare disease.
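The FS pairwise scoring described above can be illustrated with a minimal Python sketch for a single stratum. It is not the SAS implementation used in this work. The record fields (`time`, `died`, `hosp`, `treated`) are hypothetical names. The sketch assumes that fewer hospitalizations up to the shared follow-up time count as the better ancillary outcome, and that any pair undecided on survival falls back to that comparison. The variance is the permutation variance of the summed treated-arm scores.

```python
import math

def pair_score(a, b):
    """+1 if subject a fares better than b, -1 if worse, 0 if tied."""
    # Survival is decided only when one subject dies while the other
    # is still under follow-up.
    if b["died"] and a["time"] > b["time"]:
        return 1
    if a["died"] and b["time"] > a["time"]:
        return -1
    # Otherwise compare the ancillary endpoint over the shared follow-up
    # (assumption: fewer hospitalizations is better).
    t_common = min(a["time"], b["time"])
    n_a = sum(1 for t in a["hosp"] if t <= t_common)
    n_b = sum(1 for t in b["hosp"] if t <= t_common)
    return (n_a < n_b) - (n_a > n_b)

def fs_test(subjects):
    """Return per-subject scores, the treated-arm statistic and its z-value."""
    n = len(subjects)
    scores = [sum(pair_score(a, b) for b in subjects if b is not a)
              for a in subjects]
    n_trt = sum(1 for s in subjects if s["treated"])
    stat = sum(u for u, s in zip(scores, subjects) if s["treated"])
    # Permutation variance for one stratum (the scores sum to zero).
    var = n_trt * (n - n_trt) / (n * (n - 1)) * sum(u * u for u in scores)
    z = stat / math.sqrt(var) if var > 0 else 0.0
    return scores, stat, z

# Toy data (hypothetical): months of follow-up and hospitalization times.
subjects = [
    {"treated": True,  "time": 30.0, "died": False, "hosp": []},
    {"treated": True,  "time": 30.0, "died": False, "hosp": [10.0]},
    {"treated": False, "time": 12.0, "died": True,  "hosp": [5.0]},
    {"treated": False, "time": 30.0, "died": False, "hosp": [8.0, 20.0]},
]
scores, stat, z = fs_test(subjects)
```

Because each pair yields a single score for one treated-versus-control contrast, separating three dose arms requires repeating the test on subgroups, which is the limitation noted above.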
Methods: PK concentration, mortality and hospitalization data for a 30-month Phase 3 controlled trial with approximately 400 subjects enrolled in a 2:1:2 ratio (placebo:low:high) were generated in 100 stochastic simulations using the MTIME method in $DES [2] for three drug effect scenarios: (a) similar effect for placebo and low dose, (b) similar effect for low and high dose, and (c) a monotonic Emax relationship across placebo, low and high doses.
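As a rough illustration of how the three scenarios can arise from a single Emax relationship, the Python sketch below simulates mortality only, with a constant hazard reduced by an Emax function of a fixed per-arm concentration. This is a deliberate simplification and not the NONMEM MTIME simulation used here. All parameter values (`h0`, `emax`, `ec50`, and the concentrations in `scenarios`) are hypothetical and chosen only to place the doses on different parts of the curve.

```python
import random

def emax_effect(conc, emax, ec50):
    """Fractional hazard reduction at a given concentration."""
    return emax * conc / (ec50 + conc)

def simulate_trial(conc_low, conc_high, n=400, h0=0.02, emax=0.6,
                   ec50=10.0, t_end=30.0, seed=1):
    """Simulate time to death (months) for a 2:1:2 placebo:low:high trial."""
    rng = random.Random(seed)
    arms = ([("placebo", 0.0)] * (2 * n // 5)
            + [("low", conc_low)] * (n // 5)
            + [("high", conc_high)] * (2 * n // 5))
    rows = []
    for arm, conc in arms:
        # Constant hazard per month, reduced by the drug effect (assumption).
        hazard = h0 * (1.0 - emax_effect(conc, emax, ec50))
        t = rng.expovariate(hazard)
        # Administrative censoring at the end of the 30-month trial.
        rows.append({"arm": arm, "time": min(t, t_end), "died": t <= t_end})
    return rows

# Hypothetical (low, high) concentrations relative to ec50 = 10:
scenarios = {
    "a": (0.5, 40.0),     # low dose close to placebo
    "b": (100.0, 400.0),  # low and high dose both near maximal effect
    "c": (10.0, 40.0),    # graded effect across placebo, low and high
}
trials = {name: simulate_trial(lo, hi) for name, (lo, hi) in scenarios.items()}
```

In scenario (a) the low concentration sits far below EC50, and in scenario (b) both concentrations sit far above it, so only scenario (c) places the two doses where they carry distinct information about the curve.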
Simulated data were analysed in NONMEM 7.3 [3] with the following structural models: a time-to-event (TTE) model, a repeated time-to-event (RTTE) model, a time-to-event model with hospitalization frequency as a time-varying covariate (TTE-COV), and two RTTE+TTE models linked (a) by an individual exposure (RTTE+TTE 1) or (b) by a common hazard with a scaling factor between the two mechanisms (RTTE+TTE 2). Model evaluation relied on simulation-based Kaplan-Meier plots and on binned and kernel hazard VPCs [4], with model stability as an additional criterion. Power versus sample size curves for each model were calculated using the parametric power estimation (PPE) algorithm [5]. FS analyses were run in SAS 9.4 [6] to provide the reference power for each scenario.
Results: For all three scenarios investigated, the median estimated PPE curves generally ranked the methods in the following order of increasing power: TTE, FS, TTE-COV, RTTE or RTTE+TTE 1, RTTE+TTE 2. For instance, in scenario (c) the power to detect a drug effect at the original simulated sample size was 30%, 44%, 54%, 55%, 59%, and 72%, respectively. The convergence rates for RTTE+TTE 1 and 2 were 78% and 91%, respectively. In scenarios (a) and (b), EC50 was estimated with a mean precision of 16.4% and 156.5%, respectively. Type I error rates for the respective models were 1%, 4%, 6%, 7% and 5%.
Conclusions: The results suggest that the proposed joint PK/PD models, which use both survival and hospitalization data from all patients, including those who died or were otherwise censored, can substantially increase the power to detect a drug effect. RTTE+TTE-type models offer multiple benefits: they achieve higher power by enabling a two-dimensional evaluation of the exposure-response relationship and by using all available information for each patient. They may therefore allow future trials to detect the same treatment effect with smaller sample sizes.
References:
[1] Finkelstein DM, Schoenfeld DA. Combining Mortality and Longitudinal Measures in Clinical Trials. Stat Med. 1999;18:1341-1354.
[2] Nyberg J. Simulating large time-to-event trials in NONMEM. https://www.page-meeting.org/default.asp?abstract=3166
[3] Beal SL, Sheiner LB, Boeckmann AJ & Bauer RJ (Eds.) NONMEM Users Guides. 1989-2011. Icon Development Solutions, Ellicott City, Maryland, USA.
[4] Huh Y, Hutmacher MM. Application of a hazard-based visual predictive check to evaluate parametric hazard models. J Pharmacokinet Pharmacodyn. 2016 Feb;43(1):57-71.
[5] Ueckert S. Accelerating Monte-Carlo Power Studies through Parametric Power Estimation. J Pharmacokinet Pharmacodyn. 2016 Apr;43(2):223-34.
[6] SAS Institute Inc. 2017. SAS® Visual Analytics 7.4: User’s Guide. Cary, NC: SAS Institute Inc.