Model-based analyses for pivotal decisions, with an application to equivalence testing for biosimilars
B. Bieth(1), F. Mentre(2), G. Heimann(1), I. Demin(1), B. Hamren(1), S. Balser(3), D. Renard(1)
(1) Modeling & Simulation, Novartis, Basel, Switzerland; (2) UMR 738, INSERM, and Université Paris Diderot, Paris, France; (3) Clinical Operations & Biostatistics, Sandoz, Holzkirchen, Germany
Objectives: In a drug development context, non-linear mixed effects models (NLME) are routinely used for exploratory analyses. These methods are very powerful but their appropriateness relies on the correctness of the model assumptions. The strict regulatory standards applied during phase III favor the use of analysis methods which are assumption-free, but often less powerful. The objective behind the present work was to use an NLME analysis in a pivotal phase III setting, to take full advantage of the substantial improvement in power, whilst at the same time maintaining the strict regulatory standards for phase III as much as possible.
Methods: The principle of our approach is illustrated in the context of biosimilar equivalence in rheumatoid arthritis, using the American College of Rheumatology 20% (ACR20) response criterion as the primary study outcome. The planned model-based analysis would proceed as follows. To guard against model misspecification, a set of candidate models is pre-specified to describe the expected time course of ACR20 response. The models considered in this application were of Markov type [1]. Since the study aims to demonstrate equivalence between the originator product and the biosimilar, a key modeling outcome is the difference in mean response rate between the two groups at the primary end-point. We rely on model averaging [2] to combine the individual model estimates. A confidence interval for the model-averaged estimate can be derived by bootstrap and then used for formal equivalence testing.
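The averaging and testing steps described above can be sketched as follows. This is a minimal illustration, not the analysis actually planned for the study: it assumes that each bootstrap replicate has already been fitted with every candidate model, yielding one estimated response-rate difference and one information criterion per model. The use of Akaike-type weights, the percentile interval, the 1 - 2*alpha confidence level, and all function and argument names (`akaike_weights`, `model_average`, `equivalence_by_bootstrap`, `margin`) are assumptions made for the example.

```python
import numpy as np


def akaike_weights(aic):
    """Weights proportional to exp(-0.5 * (AIC - min AIC)) along the last axis.

    Assumption: AIC-based weights stand in for whatever weighting scheme
    the pre-specified analysis would actually use.
    """
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min(axis=-1, keepdims=True)
    w = np.exp(-0.5 * delta)
    return w / w.sum(axis=-1, keepdims=True)


def model_average(estimates, aic):
    """Weighted average of per-model estimates along the last axis."""
    estimates = np.asarray(estimates, dtype=float)
    return (akaike_weights(aic) * estimates).sum(axis=-1)


def equivalence_by_bootstrap(boot_estimates, boot_aic, margin, alpha=0.05):
    """Percentile bootstrap interval for the model-averaged difference.

    boot_estimates, boot_aic: arrays of shape (n_bootstrap, n_models).
    margin: hypothetical equivalence margin on the response-rate difference.
    Equivalence is declared when the (1 - 2*alpha) interval lies strictly
    inside (-margin, +margin).
    """
    averaged = model_average(boot_estimates, boot_aic)
    lower, upper = np.quantile(averaged, [alpha, 1 - alpha])
    return float(lower), float(upper), bool(-margin < lower and upper < margin)
```

Recomputing the weights within each bootstrap replicate, as done here, lets the interval reflect model-selection uncertainty as well as sampling variability; fixing the weights from the original fit would be a different design choice.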
Results: The proposed model-based test was evaluated through simulations and compared to the classical equivalence test based on end-point data only. Operating characteristics, namely type 1 error and power, were of particular interest. The investigation covered a range of simulation models and scenarios. Type 1 error appeared to be controlled under the scenarios investigated. The gain in power with the model-based test was substantial compared to the classical equivalence test.
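For orientation, the operating characteristics of the classical comparator can be estimated by simulation along the following lines. This sketch is not the simulation study reported here: it assumes a Wald interval for the difference in end-point response proportions, equal group sizes, and a two one-sided tests decision rule at one-sided level 0.05. The response rates, sample size, and margin in the usage note are hypothetical values chosen only to make the example concrete.

```python
import numpy as np


def classical_equivalence_test(x_test, x_ref, n, margin, z=1.6449):
    """Declare equivalence if the Wald interval for the difference in
    response proportions lies strictly inside (-margin, +margin).

    z = 1.6449 is the 95th percentile of the standard normal, so the
    interval has nominal 90% coverage (one-sided level 0.05 per side).
    """
    p_t, p_r = x_test / n, x_ref / n
    diff = p_t - p_r
    se = np.sqrt(p_t * (1 - p_t) / n + p_r * (1 - p_r) / n)
    return (diff - z * se > -margin) & (diff + z * se < margin)


def rejection_rate(p_test, p_ref, n, margin, n_sim=5000, seed=0):
    """Fraction of simulated trials in which equivalence is declared."""
    rng = np.random.default_rng(seed)
    x_t = rng.binomial(n, p_test, size=n_sim)
    x_r = rng.binomial(n, p_ref, size=n_sim)
    return float(np.mean(classical_equivalence_test(x_t, x_r, n, margin)))
```

Calling `rejection_rate(0.45, 0.60, 300, 0.15)` places the true difference on the boundary of the margin and so estimates type 1 error, whereas `rejection_rate(0.60, 0.60, 300, 0.15)` assumes identical response rates and so estimates power. Evaluating the model-based test in the same way would require simulating the full ACR20 time course and refitting the candidate models in every simulated trial.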
Conclusions: While these simulation results are promising, initial feedback from European health authorities suggested that further work should be undertaken to evaluate the performance of the proposed approach. In particular, the absence of theoretical results to justify type 1 error control appears to be a critical concern deserving careful consideration.
References:
[1] Lacroix BD, Lovern MR, Stockis A, Sargentini-Maier ML, Karlsson MO, Friberg LE (2009) A Pharmacodynamic Markov Mixed-Effects Model for Determining the Effect of Exposure to Certolizumab Pegol on the ACR20 Score in Patients With Rheumatoid Arthritis. Clin Pharmacol Ther 86: 387-395.
[2] Hoeting JA, Madigan D, Raftery A, et al. (1999) Bayesian model averaging: a tutorial. Statist Science 14: 382-417.